Summary
- Refactored fill_holes_vertical to make GPUization simple. This is BIT_CHANGING, but results are bit-for-bit when using -O0 optimization, thus it is not answer changing. The first pass over each grid column will not parallelize well, the k-loop needs to be done in serial. Maximum parallelization has been exposed for the global hole-filling though, at the cost of occasionally doing unneccesary calculations. larson-group/clubb#972. (details)
- Removing fill_holes_multiplicative and replacing magic numbers with parameters from constants_clubb. larson-group/clubb#972 (details)
- Moving vertical_avg and vertical_integral to advance_helper_module. larson-group/clubb#972 (details)