Discussion:
[PATCH] Fix various i386 default unit masks (atom, nehalem, silvermont)
Michael Petlan
2016-12-10 11:18:48 UTC
Hi,

I have written a static analysis tool and used it to find some
more non-unique default unit masks in the i386 CPU configuration
files. The attached patch fixes them. It should apply on top of
current master.

Cheers,
Michael
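
The tool itself is not part of this thread. As a rough illustration of the kind of check described, here is a minimal Python sketch that flags unit mask groups whose numeric default matches more than one entry. It is not Michael's tool; the function names and the simplified parsing (only the "name:... type:... default:..." header and "<value> extra:<...> <name> <description>" entry lines seen in this thread) are assumptions.

```python
def parse_unit_masks(text):
    """Parse unit_masks-style text into {group: (default, [(value, extra, name)])}.

    Assumption: only the line shapes quoted in this thread are handled.
    """
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        tokens = line.split()
        if tokens[0].startswith("name:"):
            # Header line, e.g. "name:page_walks type:bitmask default:0x03"
            fields = dict(tok.split(":", 1) for tok in tokens if ":" in tok)
            current = fields["name"]
            groups[current] = (fields.get("default", ""), [])
        elif current is not None and len(tokens) >= 3 and tokens[1].startswith("extra:"):
            # Entry line, e.g. "0x03 extra: walks Number of page-walks executed"
            value = int(tokens[0], 16)
            extra = tokens[1][len("extra:"):]
            groups[current][1].append((value, extra, tokens[2]))
    return groups


def ambiguous_defaults(groups):
    """Return the names of groups whose numeric default matches several entries."""
    ambiguous = []
    for name, (default, entries) in groups.items():
        if not default.lower().startswith("0x"):
            continue  # named default such as "default:walks"
        wanted = int(default, 16)
        matches = [entry for entry in entries if entry[0] == wanted]
        if len(matches) > 1:
            ambiguous.append(name)
    return ambiguous


SAMPLE = """\
name:page_walks type:bitmask default:0x03
\t0x03 extra: walks Number of page-walks executed
\t0x03 extra: cycles Duration of page-walks in core cycles
"""

if __name__ == "__main__":
    print(ambiguous_defaults(parse_unit_masks(SAMPLE)))
```

With the sample above, page_walks is reported because default:0x03 matches both the walks and cycles entries; switching the header to default:walks removes the ambiguity.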
William Cohen
2016-12-12 19:54:04 UTC
Hi Michael,

Thanks for the patch. The changes look reasonable. However, I noticed that a couple of the unit masks don't differ.

For the page_walks unit mask, there appears to be a missing extra:cmask=edge for "Number of page-walks executed"; otherwise the two unit masks look identical.

--- a/events/i386/atom/unit_masks
+++ b/events/i386/atom/unit_masks
@@ -15,7 +15,7 @@ name:data_tlb_misses type:bitmask default:0x07
0x05 extra: dtlb_miss_ld DTLB misses due to load operations
0x09 extra: l0_dtlb_miss_ld L0_DTLB misses due to load operations
0x06 extra: dtlb_miss_st DTLB misses due to store operations
-name:page_walks type:bitmask default:0x03
+name:page_walks type:bitmask default:walks
0x03 extra: walks Number of page-walks executed
0x03 extra: cycles Duration of page-walks in core cycles
name:x87_comp_ops_exe type:bitmask default:0x81


The unit masks for any and stalled_cycles also look the same on Nehalem. An extra: correction is needed here as well:

--- a/events/i386/nehalem/unit_masks
+++ b/events/i386/nehalem/unit_masks
@@ -37,7 +37,7 @@ name:mem_inst_retired type:bitmask default:0x01
0x02 extra: stores Counts the number of instructions with an architecturally-visible store retired on the architected path
name:mem_store_retired type:mandatory default:0x01
0x01 extra: dtlb_miss The event counts the number of retired stores that missed the DTLB
-name:uops_issued type:bitmask default:0x01
+name:uops_issued type:bitmask default:any
0x01 extra: any Counts the number of Uops issued by the Register Allocation Table to the Reservation Station, i
0x01 extra: stalled_cycles Counts the number of cycles no Uops issued by the Register Allocation Table to the Reservation Station, i
0x02 extra: fused Counts the number of fused Uops that were issued from the Register Allocation Table to the Reservation Station


-Will
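
The observation above (two entries in one group sharing the same value and the same extra: field cannot be told apart) can be checked mechanically. The following is a hedged Python sketch, not code from oprofile; the function name and the simplified line parsing are assumptions based only on the entry lines quoted in this thread.

```python
from collections import defaultdict


def indistinguishable_entries(text):
    """Map (group, value, extra) to the entry names that share that key.

    Only keys shared by two or more entries are returned. Assumption: entries
    look like "<hex value> extra:<field> <name> <description>".
    """
    seen = defaultdict(list)
    group = None
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        if tokens[0].startswith("name:"):
            group = tokens[0][len("name:"):]
            continue
        if group is None or len(tokens) < 3 or not tokens[1].startswith("extra:"):
            continue
        value = int(tokens[0], 16)
        extra = tokens[1][len("extra:"):]
        seen[(group, value, extra)].append(tokens[2])
    return {key: names for key, names in seen.items() if len(names) > 1}


SAMPLE = """\
name:uops_issued type:bitmask default:any
\t0x01 extra: any Counts the number of Uops issued
\t0x01 extra: stalled_cycles Counts the number of cycles no Uops issued
\t0x02 extra: fused Counts the number of fused Uops
"""

if __name__ == "__main__":
    for key, names in indistinguishable_entries(SAMPLE).items():
        print(key, names)
```

On this sample, any and stalled_cycles collide on value 0x01 with an empty extra: field, while fused does not. The descriptions in SAMPLE are shortened for illustration.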
Post by Michael Petlan
------------------------------------------------------------------------------
Developer Access Program for Intel Xeon Phi Processors
Access to Intel Xeon Phi processor-based developer platforms.
With one year of Intel Parallel Studio XE.
Training and support from Colfax.
Order your platform today. http://sdm.link/xeonphi
_______________________________________________
oprofile-list mailing list
https://lists.sourceforge.net/lists/listinfo/oprofile-list
Michael Petlan
2017-01-09 19:39:28 UTC
Hi William,

I have fixed both of the defects you found. The patch is attached.
I have tested it on Nehalem; the Atom 'extra:' fix was not
tested.

Thanks for pointing it out.

Michael
William Cohen
2017-01-09 21:22:31 UTC
Hi Michael,

Thanks for the updated version of the patch. The previous version of the patch was merged upstream, and I attempted to put in a fix for page_walks based on what was in other processors. Could you update the patch so that it applies to upstream oprofile?

Note that the extra: field takes "cmask=%x" (cmask= followed by a hex number) and then a comma-separated list of other flags. Thus, the following isn't right:

--- a/events/i386/atom/unit_masks
+++ b/events/i386/atom/unit_masks
@@ -15,8 +15,8 @@ name:data_tlb_misses type:bitmask default:0x07
0x05 extra: dtlb_miss_ld DTLB misses due to load operations
0x09 extra: l0_dtlb_miss_ld L0_DTLB misses due to load operations
0x06 extra: dtlb_miss_st DTLB misses due to store operations
-name:page_walks type:bitmask default:0x03
- 0x03 extra: walks Number of page-walks executed
+name:page_walks type:bitmask default:walks
+ 0x03 extra:cmask=edge walks Number of page-walks executed
0x03 extra: cycles Duration of page-walks in core cycles
name:x87_comp_ops_exe type:bitmask default:0x81
0x01 extra: s Floating point computational micro-ops executed

-Will
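
To make the rule above concrete, here is a small Python sketch that checks an extra: field against it. It is an illustration only, not oprofile's parser. The set of accepted flag names (KNOWN_FLAGS) is an assumption; only "edge" is implied by this thread.

```python
import re

# Assumption: illustrative flag names; only "edge" is implied by this thread.
KNOWN_FLAGS = frozenset({"edge", "inv", "any"})

HEX_NUMBER = re.compile(r"(0[xX])?[0-9a-fA-F]+")


def check_extra(extra, known_flags=KNOWN_FLAGS):
    """Return a list of problems found in an extra: field (empty list means OK).

    The field is treated as comma-separated tokens, where "cmask=<hex>" carries
    a hex number and every other token must be a known flag.
    """
    problems = []
    if extra == "":
        return problems  # an empty extra: field is allowed
    for token in extra.split(","):
        if token.startswith("cmask="):
            value = token[len("cmask="):]
            if not HEX_NUMBER.fullmatch(value):
                problems.append(f"cmask value is not a hex number: {token!r}")
        elif token not in known_flags:
            problems.append(f"unknown flag: {token!r}")
    return problems


if __name__ == "__main__":
    for field in ("cmask=edge", "edge", "cmask=1,inv"):
        print(field, check_extra(field))
```

Under these assumptions, "cmask=edge" is rejected because "edge" is not a hex number, while "edge" on its own and "cmask=1,inv" pass.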
Michael Petlan
2017-01-10 20:21:27 UTC
Hi Will,

The updated patch is attached. It fixes the remaining issue (Nehalem uops_issued).
I see. Sorry for that mistake, I took the "cmask=edge" string from one of your
prior mails and didn't realize that "edge" is not a valid cmask.

I hope it's correct now.

Thank you!
Michael
William Cohen
2017-01-11 18:20:16 UTC
Hi Michael,

The patch looks fine and has been merged into the upstream oprofile git repository.

-Will