Get rid of unneeded version checks #2765
Open: kshyatt wants to merge 1 commit into master from ksh/cublas_version
ac8f3e1 to ca96e09
CUDA.jl Benchmarks
| Benchmark suite | Current: ca96e09 | Previous: 5645ddc | Ratio |
|---|---|---|---|
| latency/precompile | 42854033083 ns | 48881022269.5 ns | 0.88 |
| latency/ttfp | 7148038767 ns | 7292511536 ns | 0.98 |
| latency/import | 3387489703 ns | 3494844990 ns | 0.97 |
| integration/volumerhs | 9624264.5 ns | 9618776.5 ns | 1.00 |
| integration/byval/slices=1 | 147256 ns | 147140 ns | 1.00 |
| integration/byval/slices=3 | 425877 ns | 425273 ns | 1.00 |
| integration/byval/reference | 145298 ns | 145049 ns | 1.00 |
| integration/byval/slices=2 | 286513 ns | 286349 ns | 1.00 |
| integration/cudadevrt | 103694 ns | 103459 ns | 1.00 |
| kernel/indexing | 14436 ns | 14269 ns | 1.01 |
| kernel/indexing_checked | 14861 ns | 14648 ns | 1.01 |
| kernel/occupancy | 777.5090909090909 ns | 720.354609929078 ns | 1.08 |
| kernel/launch | 2285.5555555555557 ns | 2197.3333333333335 ns | 1.04 |
| kernel/rand | 18461 ns | 14675 ns | 1.26 |
| array/reverse/1d | 19857 ns | 19825.5 ns | 1.00 |
| array/reverse/2d | 24929 ns | 25322 ns | 0.98 |
| array/reverse/1d_inplace | 10796 ns | 10664 ns | 1.01 |
| array/reverse/2d_inplace | 12244 ns | 12288 ns | 1.00 |
| array/copy | 21381 ns | 21417 ns | 1.00 |
| array/iteration/findall/int | 158440 ns | 160202.5 ns | 0.99 |
| array/iteration/findall/bool | 139170 ns | 140858 ns | 0.99 |
| array/iteration/findfirst/int | 153951 ns | 153807 ns | 1.00 |
| array/iteration/findfirst/bool | 154987 ns | 154741 ns | 1.00 |
| array/iteration/scalar | 72845 ns | 72443 ns | 1.01 |
| array/iteration/logical | 216325 ns | 219718.5 ns | 0.98 |
| array/iteration/findmin/1d | 41346.5 ns | 41952 ns | 0.99 |
| array/iteration/findmin/2d | 94176 ns | 94212 ns | 1.00 |
| array/reductions/reduce/1d | 35853 ns | 36691.5 ns | 0.98 |
| array/reductions/reduce/2d | 41462 ns | 49814 ns | 0.83 |
| array/reductions/mapreduce/1d | 34029 ns | 34861 ns | 0.98 |
| array/reductions/mapreduce/2d | 41368 ns | 41414 ns | 1.00 |
| array/broadcast | 21097 ns | 20872 ns | 1.01 |
| array/copyto!/gpu_to_gpu | 12972 ns | 13686 ns | 0.95 |
| array/copyto!/cpu_to_gpu | 214298 ns | 211802 ns | 1.01 |
| array/copyto!/gpu_to_cpu | 283270 ns | 245699 ns | 1.15 |
| array/accumulate/1d | 109578 ns | 109690 ns | 1.00 |
| array/accumulate/2d | 80665 ns | 80775 ns | 1.00 |
| array/construct | 1281.8 ns | 1289.6 ns | 0.99 |
| array/random/randn/Float32 | 43925 ns | 44847 ns | 0.98 |
| array/random/randn!/Float32 | 25118 ns | 26739.5 ns | 0.94 |
| array/random/rand!/Int64 | 27236 ns | 27115 ns | 1.00 |
| array/random/rand!/Float32 | 8871 ns | 8873 ns | 1.00 |
| array/random/rand/Int64 | 30222 ns | 38396 ns | 0.79 |
| array/random/rand/Float32 | 13438 ns | 13462 ns | 1.00 |
| array/permutedims/4d | 61819 ns | 61480 ns | 1.01 |
| array/permutedims/2d | 55360 ns | 55970.5 ns | 0.99 |
| array/permutedims/3d | 56269 ns | 56435 ns | 1.00 |
| array/sorting/1d | 2766536 ns | 2776551 ns | 1.00 |
| array/sorting/by | 3354483 ns | 3368660 ns | 1.00 |
| array/sorting/2d | 1082927.5 ns | 1086044 ns | 1.00 |
| cuda/synchronization/stream/auto | 1025.9 ns | 1062.2 ns | 0.97 |
| cuda/synchronization/stream/nonblocking | 7550.6 ns | 6529.8 ns | 1.16 |
| cuda/synchronization/stream/blocking | 798.6632653061224 ns | 804.8736842105263 ns | 0.99 |
| cuda/synchronization/context/auto | 1182.1 ns | 1171.2 ns | 1.01 |
| cuda/synchronization/context/nonblocking | 8394.6 ns | 6701.6 ns | 1.25 |
| cuda/synchronization/context/blocking | 904.4 ns | 921.8888888888889 ns | 0.98 |
This comment was automatically generated by workflow using github-action-benchmark.
Hm, looks like the |
According to the release notes, CUDA 11.4 ships with CUBLAS 11.6.5, so these checks for `version() < 11` aren't reachable.
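
The reachability argument can be sketched with plain `VersionNumber` comparisons. This is an illustration only, not code from the PR. It assumes CUDA 11.4 is the oldest toolkit still supported, which the description implies but does not state. `OLDEST_SHIPPED_CUBLAS` and `needs_legacy_path` are hypothetical names, and the two newer version numbers are arbitrary examples rather than real release versions.

```julia
# Oldest CUBLAS version expected at runtime, taken from the release-notes claim
# above (CUDA 11.4 ships CUBLAS 11.6.5). Hypothetical constant for illustration.
const OLDEST_SHIPPED_CUBLAS = v"11.6.5"

# Hypothetical guard in the style of the checks being removed.
# v"11" is the same as v"11.0.0".
needs_legacy_path(ver::VersionNumber) = ver < v"11"

# Every version at or above the oldest shipped one fails the guard,
# so the legacy branch behind it can never run.
for ver in (OLDEST_SHIPPED_CUBLAS, v"11.11.3", v"12.0.0")
    @assert ver >= OLDEST_SHIPPED_CUBLAS
    @assert !needs_legacy_path(ver)
end
```

If the assumption about the minimum supported toolkit holds, any branch guarded this way is dead code and can be deleted without changing behavior.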