Small improvements to diff_netcdf_outputs.py, removing its reliance on ncdiff so that it is now entirely in Python. Cleaning up linux_x86_64_nvhpc_gpu.bash: removing outdated parts, improving default parallel compilation, and changing pgfortran to nvfortran.
Small tweaks to fix some GPU bugs. Some variables were uninitialized on the CPU while we were saving them. This could only have been caught by comparing consecutive runs and checking the _zt and _zm files, and even then only a few cases had problems.
Fixing a labelling error in redirect_interpolated_azt_2D and similar procedures: since this interpolates to zt, the input should be zm. I think this was my fault, so I cleaned up all the zt2zm and zm2zt things to make them a little nicer. Also ordered the routines as _k, _1D, _2D to make it easier to jump around; it was a bit confusing when they were out of order, and the typo made it really hard.
bug fixes for the autocommit message maker code
Making it so sclr_tol is set to 0 before it is set to the specified sclr_tol_in. This is so that it is initialized to 0 in the case that sclr_dim = 0, since we are now giving it a minimum allocation size of 1 and it would otherwise have a garbage value. This is what broke the clubb_openmp_gfortran_test.
This commit is a commit that changes absolutely nothing. It is meant to trigger a change in the git update scripts, so that I can start the commit message logging in the autocommit updates larson-group/sys_admin#797
Fixing an error with the autocommit_update script that was causing it not to work
this is another commit that changes nothing and that will trigger the gitUpdate scripts
Updates to make the convergence tests run on Anvil.
adding an update that changes nothing and is just a test for the autoupdate script
Making CLUBB's splatting scheme implicit and smoother (#1075)
change to calc pressure to trigger autoupdate
Editing convergence scripts to show that the directory should be placed in
Committing scripts for use in running CLUBB convergence tests in the background.
Updated the background run scripts for the convergence tests with a comment
I updated the README to finish the section on the convergence tests.
Edited the README section on CLUBB convergence tests.
I added dycoms2_rf01 to the list of cases that could be run.
I updated the run_cnvg_test_multi_cases_revall.csh script to include
Modified run_cnvg_test_multi_cases_default.csh and
Added comments to the script to explain ambiguous portions of my code
GPUizing Lscale_width_vert_avg. Loops have been restructured for simplicity, and algorithm has a different starting value to avoid k dependency. Results are BFB. (#1083)
GPUizing most of advance_clubb_core (#1084)
advance_wp2_wp3 with explicitly managed memory (#1085)
advance_xp2_xpyp with explicitly managed memory (#1086)
advance_windm_edsclrm with explicitly managed memory (#1087)
* Adding Skthl_zm to the update host list; I missed this in the last PR. I noticed this by comparing results with and without managed memory. I have now checked BFBness with arm, mpace_b, mc3e, and gabls2.
* Small GPU fixes (#1076)
* Fixing small things that I caught by adding the default(present) onto acc loops.
* Moving default(present) to the end because it looks nicer there.
* Adding default(present) to all acc loop statements. Also adding azt to a copyin statement, which was missed previously. All BFB.
* Incremental update, not well tested yet.
* Removing some copies and making the sclr_dim change.
* Fixing a bug that only seemed detectable with astex_a209. We need to pass only single arrays to functions: calling ddzt( nz, ngrdcol, gr, rho_ds_zt * K_zt_nu ) resulted in rho_ds_zt * K_zt_nu being evaluated on the CPU, but the values were only valid on the GPU. So we need to evaluate that expression on the GPU, save it into an array (currently K_zt_nu_tmp), then pass that to ddzt.
* GPUizing calc_turb_adv_range
* GPUizing mono_flux_limiter
* Cleaning up data statements and a couple other things.
* Updated for some different options.
* More updates needed for various options.
* Reverting accidental flag change
* Should be the final changes, all options tested now.
* Replacing some comments in mono_flux_limiter, and also modifying it to make it BFB on CPUs. Also changing incorrect error conditions on tridiag.
* Adding max_x_allowable to update host statement, missed previously.
* Properly naming tmp variables; variables calculated from ddzt and ddzm now start with ddzt_ and ddzm_.
* Replacing constants with named ones from constants_clubb.
* Replacing hard-coded numbers in lhs variables, representing the number of bands they contain, with Fortran parameters.
Small improvements to diff_netcdf_outputs.py, removing its reliance on ncdiff so that it is now entirely in Python. Cleaning up linux_x86_64_nvhpc_gpu.bash: removing outdated parts, improving default parallel compilation, and changing pgfortran to nvfortran.
Small tweaks to fix some GPU bugs. Some variables were uninitialized on the CPU while we were saving them. This could only have been caught by comparing consecutive runs and checking the _zt and _zm files, and even then only a few cases had problems.
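A check of this kind, comparing the output of two consecutive runs variable by variable, could look roughly like the sketch below. It is not the code in diff_netcdf_outputs.py: `diff_outputs` is a hypothetical helper, reading the netCDF files is left out, and each run is assumed to be already loaded into a dict that maps variable names to flat lists of values.

```python
def diff_outputs(run_a, run_b):
    """Return {variable name: max abs difference} for variables that differ.

    Hypothetical helper: run_a and run_b are assumed to be dicts mapping
    variable names to flat lists of floats, e.g. loaded from _zt or _zm files.
    """
    diffs = {}
    # Only variables present in both runs can be compared.
    for name in sorted(set(run_a) & set(run_b)):
        a, b = run_a[name], run_b[name]
        if len(a) != len(b):
            # A length mismatch is reported as an infinite difference.
            diffs[name] = float("inf")
            continue
        # Exact comparison: any non-identical value means the variable is
        # not bit-for-bit (BFB) between the runs. NaN compares unequal to
        # everything, so NaN values are flagged as well.
        if any(x != y for x, y in zip(a, b)):
            diffs[name] = max(abs(x - y) for x, y in zip(a, b))
    return diffs


if __name__ == "__main__":
    first = {"thlm": [300.0, 301.0], "rtm": [0.01, 0.02]}
    second = {"thlm": [300.0, 301.0], "rtm": [0.01, 0.025]}
    for var, max_diff in diff_outputs(first, second).items():
        print(f"{var}: max abs diff = {max_diff:.3e}")
```

An empty result means the two runs agree exactly for every shared variable, which is what two consecutive runs of the same configuration should give when nothing is uninitialized.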
Fixing a labelling error in redirect_interpolated_azt_2D and similar procedures: since this interpolates to zt, the input should be zm. I think this was my fault, so I cleaned up all the zt2zm and zm2zt things to make them a little nicer. Also ordered the routines as _k, _1D, _2D to make it easier to jump around; it was a bit confusing when they were out of order, and the typo made it really hard.
Making it so sclr_tol is set to 0 before it is set to the specified sclr_tol_in. This is so that it is initialized to 0 in the case that sclr_dim = 0, since we are now giving it a minimum allocation size of 1 and it would otherwise have a garbage value. This is what broke the clubb_openmp_gfortran_test.
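The initialization order can be illustrated with a minimal Python sketch. This is not the Fortran code in CLUBB: `init_sclr_tol` is a hypothetical helper, and only sclr_tol, sclr_tol_in, and sclr_dim are names taken from the commit message.

```python
def init_sclr_tol(sclr_dim, sclr_tol_in):
    """Build sclr_tol with a minimum allocation size of 1 (hypothetical helper)."""
    # Allocate at least one element, even when sclr_dim = 0, and zero every
    # element first so that nothing is left holding an undefined value.
    sclr_tol = [0.0] * max(sclr_dim, 1)
    # Then overwrite the first sclr_dim entries with the specified values.
    for i in range(sclr_dim):
        sclr_tol[i] = sclr_tol_in[i]
    return sclr_tol


if __name__ == "__main__":
    print(init_sclr_tol(0, []))            # one zeroed element
    print(init_sclr_tol(2, [1e-3, 2e-3]))  # the specified tolerances
```

With sclr_dim = 0 the loop body never runs, so the single allocated element keeps the value 0 from the first step instead of whatever happened to be in memory.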
* Making 2 new functions, zm2zt2zm and zt2zm2zt, to handle smoothing by interpolation. Replaced the spots in CLUBB that I know use this to smooth things. This is just a nice-to-have and could allow for easy optimizations in the future by inlining the interpolations. All cases BFB on CPU and GPU, checked all relevant options too.
* GPUizing diagnose_Lscale_from_tau
* Removing some unused variables.
* Moving acc data statements from calc_Lscale_directly up to advance_clubb_core.
* Removing an unused variable.
* GPUizing the l_smooth_min_max option.
* GPUizing l_avg_Lscale
* Changes to variable names to avoid gross long names only used once.
* GPUizing pvertinterp even though I don't think we care about the l_do_expldiff_rtm_thlm flag
* Fixing bug. Setting l_do_expldiff_rtm_thlm causes us to use edsclrm, so we also need to ensure that edsclr_dim > 1 (1 because it uses an edsclr_dim-1 index)
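The idea behind smoothing by interpolation (zm2zt2zm in the list above) can be sketched in Python under strong simplifying assumptions. This is not CLUBB's implementation: the grid is assumed uniform, zt levels are assumed to sit midway between zm levels, the boundary values are linearly extrapolated, and the function bodies are illustrative only.

```python
def zm2zt(azm):
    """Interpolate from zm levels to the midpoints between them (assumed zt levels)."""
    return [0.5 * (azm[k] + azm[k + 1]) for k in range(len(azm) - 1)]


def zt2zm(azt):
    """Interpolate from midpoint (zt) levels back to zm levels.

    Interior zm levels are the average of the two neighbouring zt values;
    the bottom and top zm levels are linearly extrapolated (an assumption
    of this sketch). Needs at least two zt values.
    """
    interior = [0.5 * (azt[k] + azt[k + 1]) for k in range(len(azt) - 1)]
    bottom = 1.5 * azt[0] - 0.5 * azt[1]
    top = 1.5 * azt[-1] - 0.5 * azt[-2]
    return [bottom] + interior + [top]


def zm2zt2zm(azm):
    """Smooth a zm-level field by interpolating to zt levels and back."""
    return zt2zm(zm2zt(azm))


if __name__ == "__main__":
    print(zm2zt2zm([0.0, 1.0, 2.0, 3.0]))       # linear profile
    print(zm2zt2zm([0.0, 0.0, 4.0, 0.0, 0.0]))  # single-level spike
```

In this sketch a linear profile passes through unchanged, while a single-level spike is reduced and spread over its neighbours, which is the smoothing effect. Keeping the round trip in one function is also what would make it easy to inline the two interpolations later.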
This commit is a commit that changes absolutely nothing. It is meant to trigger a change in the git update scripts, so that I can start the commit message logging in the autocommit updates larson-group/sys_admin#797
* BIT_CHANGING:3b086a40085284aa49c71d32c001d20153a8ddb4 The last commit is bit changing for only some cases and only when using optimization higher than -O2. uf min seems to be the first calculation that starts to differ bitwise. Using the check_multicol script confirms the differences are small.
* Adding a tweak to surface values in the extra columns. This helped me check calc_sfc_varnce, since we were not changing any arrays that would've affected calculations there.
* Small optimization, making wstar and ustar2 scalars.
* GPUizing calc_sfc_varnce
* Removing conditional around some stats calls. Now we will always save sfc values to stats. Because this will change stats files when gr%zm(i,1) > sfc_elevation, this is potentially BIT_CHANGING.
* Merging with latest clubb changes and making it work on GPUs again.
This contained 2 commits that are BIT_CHANGING in some situations.
Editing convergence scripts to show that the directory should be placed in scratch space, where there is plentiful room to run, given the size of the output files.
GPUizing Lscale_width_vert_avg. Loops have been restructured for simplicity, and algorithm has a different starting value to avoid k dependency. Results are BFB. (#1083)