Post by Beaman, Thomas
-----Original Message-----
Sent: Wednesday, July 13, 2016 10:16 AM
To: Beaman, Thomas
Subject: RE: oprofile support for PPC_E5500
Post by Michael Petlan
Since the `perf list` output shows stalled-cycles-frontend and
stalled-cycles-backend events, it seems it does not use the
e6500-pmu.c file, since these events are specified in e500-pmu.c only.
Tom, the question was whether WindRiver has some kernel patches for
e5500 support or not.
cat /proc/kallsyms | grep pmu | grep init
If you see e5500 there, your kernel probably has some
WindRiver-specific patch, since there is probably no init_e5500_pmu in
the upstream kernel.
Hi Michael,
There are no specific WindRiver patches that I can see for the e5500 and
there is no init_e5500 in kallsyms. How do you think we should proceed?
Thanks,
Tom
c000000000059de8 t .fsl_emb_pmu_event_init
c00000000005acf0 t .init_e500_pmu
c00000000005ade4 t .init_e6500_pmu
c000000000c51f10 t __initcall_init_e500_pmuearly
c000000000c51f18 t __initcall_init_e6500_pmuearly
c000000000d3e480 d fsl_emb_pmu_event_init
c000000000d3e5b8 d init_e500_pmu
c000000000d3e5e8 d init_e6500_pmu
Hi Tom,
as William said, it would be nice to have the WindRiver oprofile patch posted
to the oprofile-list and incorporated into upstream oprofile, so that a
separate patch is no longer needed. Could you please ask someone at WindRiver
about it?
I have filed a request with WindRiver to submit their e6500 patch for review, and they have responded that they will look into it and get back to me. I will let you know what they say.
Post by Beaman, Thomas
Then, it would be nice to know the set of events supported by the e5500 in order
to have a correct events/ppc64/e5500/events file. In my patch, I assumed
that it has the same events as the e6500, which is unlikely. So it might either
have the e500mc's events or it might have its own set.
Since we don't have sufficient knowledge, I am trying to approximate
the correct set. For that, I asked you to test the CLK_FPU_DIV event against some
program that performs lots of FPU divisions, and the L1_CACHE_* events against
something more cache-stressing than the tested 'sleep 1'
command. You may use my sample program (attached).
As I wrote in some of the previous e-mails, the following from your
DL1_RELOADS,18240,100.00
IL1_FETCH_RELOADS,11714,100.00
L1_STASH_HIT,0,100.00
L1_STASH_REQ,0,100.00
L1_CACHE_MISSES,0,100.00
L1_CACHE_LOAD_MISSES,0,100.00
L1_CACHE_STORE_MISSES,0,100.00
While the DL1_RELOADS and IL1_FETCH_RELOADS events give non-zero
results, L1_CACHE_MISSES, L1_CACHE_LOAD_MISSES and
L1_CACHE_STORE_MISSES are zero. This is probably wrong. Since the
L1_CACHE_MISSES event and its LOAD/STORE variants are e6500-only, I
guess they are a sign that the e5500 cannot have the e6500 event set. You can
confirm this guess by checking whether you get non-zero values
here with a cache-stressing workload or not.
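For readers without the attachment, a cache-stressing workload of the kind described above could look roughly like the following. This is only a minimal sketch, not the attached cache_stress program: the function name stride_touch and its parameters are hypothetical, and the 64-byte stride suggested below assumes a 64-byte L1 cache line.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, NOT the cache_stress program attached to the thread.
 * Each pass stores to one byte per `stride` across the buffer, then loads
 * the same bytes back. With a buffer much larger than L1 and a stride of
 * one cache line (64 bytes is an assumption), most accesses should miss L1.
 * Returns the sum of the loaded bytes so the loads cannot be optimized out. */
unsigned long stride_touch(size_t bytes, size_t stride, unsigned passes)
{
    volatile unsigned char *buf;
    unsigned long sum = 0;

    if (stride == 0)
        return 0;
    buf = malloc(bytes);
    if (buf == NULL)
        return 0;

    for (unsigned p = 0; p < passes; p++) {
        for (size_t i = 0; i < bytes; i += stride)
            buf[i] = 1;                 /* store pass */
        for (size_t i = 0; i < bytes; i += stride)
            sum += buf[i];              /* load pass */
    }

    free((void *)buf);
    return sum;
}
```

Called from a small main with, say, a buffer of a few hundred megabytes and a stride of 64, this could be run under ocount with the L1_CACHE_* events to see whether any of them become non-zero.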
Sorry I was not able to get you this data sooner; I got tied up in other matters.
I compiled and ran your cache stress program, and I think it confirms your theory that the e5500 is not the same as the e6500 in this area. The results still show zero for the same L1 entries:
./cache_stress 32384
Allocated 1061158912 bytes from addr 0xb8b43008
Doing 100000 memory operations.
Finished 100000 memory operations.
./check_events.sh
grep L1 results.log
DL1_RELOADS,18561,100.00
IL1_FETCH_RELOADS,10777,100.00
L1_STASH_HIT,0,100.00
L1_STASH_REQ,0,100.00
L1_CACHE_MISSES,0,100.00
L1_CACHE_LOAD_MISSES,0,100.00
L1_CACHE_STORE_MISSES,0,100.00
L1_CACHE_IM,0,100.00
Also (the other L1 events showed the same type of results)
ocount -e L1_STASH_HIT ./cache_stress 32384
Allocated 1061158912 bytes from addr 0xb8513008
Doing 100000 memory operations.
Finished 100000 memory operations.
Events were actively counted for 662947632 nanoseconds.
Event counts (actual) for /opt/XRX_IOT/miopqt/cache_stress:
Event Count % time counted
L1_STASH_HIT 0 100.00
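The check being done by eye on results.log (which events stay at zero) could be automated along these lines. This is a sketch under the assumption that every line has the "EVENT,count,percent" shape shown in the grep output above; the function name event_count_is_zero is hypothetical and is not part of check_events.sh or oprofile.

```c
#include <stdio.h>

/* Hypothetical helper: parse one "EVENT,count,percent" line, the format
 * seen in the results.log excerpts in this thread.
 * Returns  1 if the line parsed and the count is zero,
 *          0 if the line parsed and the count is non-zero,
 *         -1 if the line does not match the expected format. */
int event_count_is_zero(const char *line)
{
    char name[64];
    unsigned long long count;
    double percent;

    if (sscanf(line, "%63[^,],%llu,%lf", name, &count, &percent) != 3)
        return -1;
    return count == 0;
}
```

Feeding each line of results.log through this function would list the events that never fire on the e5500, which are the candidates for removal from an e5500 events file.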
Post by Beaman, Thomas
It should count cycles spent in fdivs or fdiv instructions. So I'd try to test it by
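#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    /* Guard against a missing argument before dereferencing argv[1]. */
    if (argc < 2) {
        fprintf(stderr, "usage: %s <number>\n", argv[0]);
        return 1;
    }

    double a = atof(argv[1]);   /* runtime input, so the division is not folded */
    double b = 3.14159265;
    double c = a / b;           /* expected to compile to a single fdiv at -O0 */

    printf("res = %lf\n", c);
    return 0;
}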
gcc -o div -O0 -g div.c
objdump -d div
ocount -e CLK_FPU_DIV ./div 23445
Thanks.
Michael
objdump shows there is one fdiv instruction in the program
powerpc-wrs-linux-gnu-objdump -d div | grep -i fdiv
10000500: fc 0d 00 24 fdiv f0,f13,f0
***@miopqt_250:miopqt# ocount -e CLK_FPU_DIV ./div 23445
res = 7462.775290
Events were actively counted for 3338854 nanoseconds.
Event counts (actual) for /opt/XRX_IOT/miopqt/div:
Event Count % time counted
CLK_FPU_DIV 1 100.00
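A single fdiv gives a very small count, so a looped variant might make the CLK_FPU_DIV behaviour easier to judge. The following is a minimal sketch only; divide_many is a hypothetical function that does not appear in the thread, and how the event count scales with the iteration count on the e5500 has not been measured here.

```c
/* Hypothetical variant of div.c: perform `iterations` dependent
 * floating-point divisions. The divisor is read through a volatile so the
 * compiler must issue a real division on every iteration instead of
 * folding or hoisting it. */
double divide_many(double a, double b, long iterations)
{
    volatile double divisor = b;
    double acc = a;

    for (long i = 0; i < iterations; i++)
        acc = acc / divisor + a;   /* one division per iteration */

    return acc;
}
```

Calling this from main with a large iteration count and running the binary under `ocount -e CLK_FPU_DIV` would show whether the count grows with the number of divisions, as it should if the event counts what is described above.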
Let me know what other data I can help collect.
Thanks for your help on this,
Tom