Discussion:
OProfile-1.2.0-rc1
William Cohen
2017-06-19 15:08:20 UTC
Permalink
There have been a number of improvements checked into the OProfile git repo, and it would be good to get those into an official release. I have made a release candidate, and it would be good to test it on various platforms.

URL:
https://sourceforge.net/projects/oprofile/files/oprofile/oprofile-1.2.0rc1/oprofile-1.2.0rc1.tar.gz/download

RELEASE NOTES:

New features
------------

- New/updated Processor Support
* ARM Cortex A17
* IBM Power 9
* IBM Power 8NV and NVL variants
* IBM z13
* Intel Goldmont
* Intel Kabylake
* Intel Xeon Phi (Knights Landing)
* Architecture-specific events for Applied Micro X-Gene

Bug fixes
---------

Filed bug reports:
-------------------------------------------------------------------------
| BUG ID | Summary
|-----------|------------------------------------------------------------
| 286 | Compilation error: left shift of negative value
| 288 | oprofile fails to build with --enable-pch and gcc-6.2
-------------------------------------------------------------------------

Other bug fixes and improvements without a filed report (e.g., posted to the list):
---------------
- Fixed compile warnings and errors when using GCC 6 or GCC 7
- Avoid using deprecated readdir_r function
- Store samples in the archive and search the appropriate places
for samples
- Only start the application if the perf events setup was successful


Known problems and limitations
-------------------------

- When using operf to profile multiple events, the absolute number of
events recorded may be substantially fewer than expected. This can
be due to a known bug in the Linux kernel's Performance Events
Subsystem that was fixed sometime between Linux kernel versions 3.1
and 3.5.

- Monitoring processes that frequently create and destroy threads via
the "--pid" option can be problematic. The pipes used within operf
and ocount may fill up, which can cause these programs to hang and
require multiple Ctrl-C presses to exit rather than successfully
collecting data on fast-spawning processes and their children.
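
To illustrate what "the pipes may fill up" means in general, here is a minimal Python sketch. It is not operf or ocount code; the function name fill_pipe and the chunk size are made up for the illustration. It writes to a pipe that nobody reads from until the kernel refuses more data. A blocking writer would simply stall at that point, which is the kind of hang described above.

```python
import os

def fill_pipe(chunk_size=4096):
    """Write to an unread pipe until it is full; return the bytes accepted.

    Hypothetical helper for illustration only, not part of OProfile.
    """
    read_fd, write_fd = os.pipe()
    # Non-blocking mode turns "writer would hang" into BlockingIOError.
    os.set_blocking(write_fd, False)
    written = 0
    try:
        while True:
            written += os.write(write_fd, b"x" * chunk_size)
    except BlockingIOError:
        # The pipe buffer is full and no reader is draining it.
        pass
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return written

capacity = fill_pipe()
print(f"pipe accepted {capacity} bytes before a writer would block")
```

The exact capacity depends on the kernel and its configuration (64 KiB is a common default on Linux), so the sketch reports it rather than assuming a value.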
William Cohen
2017-06-19 18:57:37 UTC
Permalink
Post by Carl Love
I downloaded the RC1 tarball and tested it on IBM Power 7, IBM Power 8
big endian, and IBM Power 8 little endian. I did some manual tests of a
known workload. I also downloaded and ran the OProfile test suite.
Everything looked fine.
Carl Love
Hi Carl,

Thanks so much for testing on Power. I also created some Fedora Rawhide RPMs: https://koji.fedoraproject.org/koji/taskinfo?taskID=20066548
I am going to try these out on the various machines I have access to.

-Will
Michael Petlan
2017-06-20 12:09:55 UTC
Permalink
Hi William,

I have tested this tarball on a Knights Landing and everything looks good,
except that the testsuite still does not contain my KNL patch I guess.

However, even when I applied the patch to the testsuite, I got:
"Native configuration is x86_64-unknown-linux-gnu"
Is it somehow connected to the testsuite or is that related to the build?
It looks like the events I specified in the testsuite patch for KNL were
used in the test, so I expect it works.

I also ran selected Red Hat tests against it.

Cheers,
Michael
William Cohen
2017-06-20 14:42:51 UTC
Permalink
Post by Michael Petlan
Hi William,
I have tested this tarball on a Knights Landing and everything looks good,
except that the testsuite still does not contain my KNL patch I guess.
Hi Michael,

Thanks for testing.

Sorry about the missing Intel KNL oprofile-tests patch. This should be in the upstream oprofile-tests now.
Post by Michael Petlan
"Native configuration is x86_64-unknown-linux-gnu"
Is it somehow connected to the testsuite or is that related to the build?
It looks like the events I specified in the testsuite patch for KNL were
used in the test, so I expect it works.
The "Native configuration is x86_64-unknown-linux-gnu" output appears to be coming from runtest rather than from the oprofile tests themselves. I am also seeing it in the systemtap testsuite output. It looks to be harmless.
大平怜
2017-06-22 19:53:33 UTC
Permalink
Hi,

FYI, I submitted patches for this problem about a year ago. :-)
http://marc.info/?l=oprofile-list&m=146721380410945&w=2
Post by William Cohen
- Monitoring processes that frequently create and destroy threads via
the "--pid" option can be problematic. The pipes used within operf
and ocount may fill up can cause these programs to hang and require
multiple cntl-C to exit rather than successfully collecting data on
fast spawning processes and children.
Regards,
Rei Odaira
Will Schmidt
2017-06-26 14:42:19 UTC
Permalink
Post by William Cohen
There have been a number of improvements checked into the OProfile git repo. It would be good to have those into an official release. I have made a release candidate and it would be good to test this out on various platforms.
https://sourceforge.net/projects/oprofile/files/oprofile/oprofile-1.2.0rc1/oprofile-1.2.0rc1.tar.gz/download
I grabbed it and did a build/run across a small assortment of Power
systems (power7, power8LE, power8BE); the test suite runs clean.
(It also looks good on pre-GA P9.)
=== oprofile Summary ===

# of expected passes 78

One error in 'make check' (I also see this on my x86 laptop).
I suspect it is not a big deal, but in case... :-)

FAIL: load_events_files_tests

um zero is not used

A quick peek via gdb suggests this is occurring when trying to parse the
events files for CPU_CORE_I7, down the op_events(cpu_type) path. If I
force that test to pass via setting err=0 in gdb, it occurs again on
CPU_NEHALEM.

Breakpoint 1, op_events (cpu_type=CPU_CORE_I7) at op_events.c:724
724 {
$60 = CPU_CORE_I7
um zero is not used

Breakpoint 1, op_events (cpu_type=CPU_NEHALEM) at op_events.c:724
724 {
$63 = CPU_NEHALEM
um zero is not used


thanks
-Will (Schmidt)
William Cohen
2017-06-28 15:53:27 UTC
Permalink
Hi Will,

I took a look at this failure from "make check". It is due to events/i386/arch_perfmon/unit_masks defining the unit mask zero, which is then not used by any of the nehalem events. i386/nehalem/events has:

#event:0x3c counters:0,1,2,3 um:zero minimum:6000 name:CPU_CLK_UNHALTED : Clock cycles when not halted
event:0x3c counters:0,1,2,3 um:one minimum:6000 name:UNHALTED_REFERENCE_CYCLES : Unhalted reference cycles

The nehalem events are included in the i386/core_i7 events, so core_i7 suffers from the same problem. This appears to be relatively harmless. However, it does cause the load_events_files_tests test to abort, so processors after core_i7 (including nehalem) are not checked for event sanity. I manually set err=0 after each report of the unused mask and didn't see any other problems.
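
The kind of check that trips here can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual load_events_files_tests or op_events.c code; the function names are hypothetical, and the list of defined unit masks is passed in directly instead of being parsed from a unit_masks file. Lines starting with "#" are treated as comments, as in the events excerpt above.

```python
import re

def used_unit_masks(event_lines):
    """Collect the um:<name> values from non-comment event lines."""
    used = set()
    for line in event_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank or commented-out event
        match = re.search(r"\bum:(\S+)", line)
        if match:
            used.add(match.group(1))
    return used

def unused_unit_masks(defined_masks, event_lines):
    """Return the defined unit mask names that no event refers to."""
    return sorted(set(defined_masks) - used_unit_masks(event_lines))

events = [
    "#event:0x3c counters:0,1,2,3 um:zero minimum:6000 name:CPU_CLK_UNHALTED : Clock cycles when not halted",
    "event:0x3c counters:0,1,2,3 um:one minimum:6000 name:UNHALTED_REFERENCE_CYCLES : Unhalted reference cycles",
]

# "zero" and "one" stand in for the masks defined in the unit_masks file.
for name in unused_unit_masks(["zero", "one"], events):
    print(f"um {name} is not used")
```

With the CPU_CLK_UNHALTED line commented out, only "one" is referenced, so the sketch reports the same "um zero is not used" message seen in the test output.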

-Will Cohen
William Cohen
2017-06-28 16:08:54 UTC
Permalink
The attached patch should address the test failure. Does that seem like a reasonable fix? -Will
Will Schmidt
2017-06-28 18:29:19 UTC
Permalink
Looks good to me. It fixes the problem, at least. :-)


PASS: cpu_type_tests
PASS: parse_event_tests
PASS: load_events_files_tests
PASS: alloc_counter_tests
PASS: mangle_tests
PASS: utf8_checker.sh
============================================================================
Testsuite summary for OProfile 1.2.0rc1
============================================================================
# TOTAL: 6
# PASS: 6
# SKIP: 0
# XFAIL: 0
# FAIL: 0
# XPASS: 0
# ERROR: 0
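
For anyone comparing runs across several machines, a small Python sketch like the following can tally the per-test lines. It assumes the "STATUS: test_name" line format shown in the output above; the function name tally_results is hypothetical and is not part of OProfile or its test suite.

```python
from collections import Counter

STATUSES = {"PASS", "FAIL", "SKIP", "XFAIL", "XPASS", "ERROR"}

def tally_results(log_text):
    """Count per-test result lines such as 'PASS: cpu_type_tests'."""
    counts = Counter()
    for line in log_text.splitlines():
        status, sep, name = line.partition(": ")
        # Summary lines like '# PASS: 6' start with '#', so they are skipped.
        if sep and status in STATUSES and name.strip():
            counts[status] += 1
    return counts

sample_log = """\
PASS: cpu_type_tests
PASS: parse_event_tests
PASS: load_events_files_tests
PASS: alloc_counter_tests
PASS: mangle_tests
PASS: utf8_checker.sh
# TOTAL: 6
# PASS: 6
# FAIL: 0
"""

counts = tally_results(sample_log)
print(f"PASS={counts['PASS']} FAIL={counts['FAIL']}")
```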
William Cohen
2017-07-07 20:02:02 UTC
Permalink
Due to some fixes for "make check", I have made oprofile-1.2.0-rc2.tar.gz. The proposed fixes to address the rapid thread spawning are not in this release.

URL:
https://sourceforge.net/projects/oprofile/files/oprofile/oprofile-1.2.0rc2/oprofile-1.2.0rc2.tar.gz/download

RELEASE NOTES:


OProfile is a powerful system-wide profiler for Linux.
Read more at http://oprofile.sf.net

OProfile 1.2.0 has been released.

New features
------------

- New/updated Processor Support
* ARM Cortex A17
* IBM Power 9
* IBM Power 8NV and NVL variants
* IBM z13
* Intel Goldmont
* Intel Kabylake
* Intel Xeon Phi (Knights Landing)
* Architecture-specific events for Applied Micro X-Gene

Bug fixes
---------

Filed bug reports:
-------------------------------------------------------------------------
| BUG ID | Summary
|-----------|------------------------------------------------------------
| 286 | Compilation error: left shift of negative value
| 288 | oprofile fails to build with --enable-pch and gcc-6.2
-------------------------------------------------------------------------

Other bug fixes and improvements without a filed report (e.g., posted to the list):
---------------
- Fixed compile warnings and errors when using GCC 6 or GCC 7
- Avoid using deprecated readdir_r function
- Store samples in the archive and search the appropriate places
for samples
- Only start the application if the perf events setup was successful
- Corrections in the code and i386 events so "make check" tests pass


Known problems and limitations
-------------------------

- When using operf to profile multiple events, the absolute number of
events recorded may be substantially fewer than expected. This can
be due to a known bug in the Linux kernel's Performance Events
Subsystem that was fixed sometime between Linux kernel versions 3.1
and 3.5.

- Monitoring processes that frequently create and destroy threads via
the "--pid" option can be problematic. The pipes used within operf
and ocount may fill up, which can cause these programs to hang and
require multiple Ctrl-C presses to exit rather than successfully
collecting data on fast-spawning processes and their children.
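
For the first limitation above (low event counts with multiple events on older kernels), a rough way to flag possibly affected systems is to compare the kernel release against 3.5. The Python sketch below is an assumption-laden illustration: the function names are hypothetical, and because the fix landed "sometime between" 3.1 and 3.5, a kernel below 3.5 is only flagged as possibly affected, not as definitely affected.

```python
import re

def kernel_version(release):
    """Extract (major, minor) from a release string such as '3.1.0-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))

def may_undercount(release):
    """True if the kernel predates 3.5 and so may still carry the bug."""
    return kernel_version(release) < (3, 5)

for release in ("3.1.0-generic", "3.5.0", "4.11.8-300.fc26.x86_64"):
    verdict = "may undercount" if may_undercount(release) else "should be fine"
    print(f"{release}: {verdict}")
```

On a Linux system, the release string of the running kernel can be obtained with platform.release() from the Python standard library and passed to may_undercount().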
