Skip to content

Conversation

@mtbc
Copy link
Collaborator

@mtbc mtbc commented Feb 14, 2022

Reduces "if" conditions in Python code in an attempt to reduce register consumption or warp divergence.

In practice, I don't see any improvement, but opening this PR in case it forms a useful basis for further work.

for i in range(0, 5):
if vdots[i] < 0 and best_time > times[i]:
best_time = times[i]
side = i
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, here, if I try,

is_better = vdots[i] < 0 and best_time > times[i]
if is_better:
    best_time = times[i]
    side = i

then it's still (relatively) fast but with,

is_better = vdots[i] < 0 and best_time > times[i]
best_time = is_better and times[i] or best_time
side = is_better and i or side

it becomes extremely slow. Have you any tips, @ckendrick?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be an issue with how side is being set. I ran into an issue where I was setting the side to an incorrect value which caused neutrons to get stuck in the loop for all max_bounces number of iterations. If you set max_bounces = 100 and it is significantly faster, then this is probably the issue. Also, running test_guide.py in interactive mode should show differences if the side is incorrect.

Sorry I haven't opened a PR yet for the optimized tapered guide. The equivalent loop from mine is

i = 0
for k in range(4):
    side_hit = vdots[k] < 0.0
    t2 = times[k] / vdots[k]
    best_time = side_hit and t2 < t1
    i += best_time * int(k + 1)
    t1 = ((not best_time) * t1) + best_time * t2
if i == 0:
    break

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, thank you. Nice trick with using the numerical aspect of the truth value more directly than I.

And, yes, reducing max_bounces correspondingly speeds the run. Maybe my two alternatives aren't logically equivalent and I've yet to spot why!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran into an issue where I was setting the side to an incorrect value ...

and so was I,

... my two alternatives aren't logically equivalent ...

so they weren't. 😃

Will push a followup commit but still doesn't improve performance significantly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants