@KameronLloyd13

Added documentation and built out the NFIP claims function. Two questions for review: I believe the net_* columns (net_building_payment_amount, net_contents_payment_amount) are the best values to use for the amount paid to the person who submitted the claim, but there are also amount_paid_* columns. The net_* columns are consistently lower than the amount_paid_* columns (with only a few exceptions), so my interpretation is that they take into account the total of cashed / uncashed checks. I wanted to double-check whether you interpret this the same way: https://www.fema.gov/openfema-data-page/fima-nfip-redacted-claims-v2.

I also saved a version of the full cleaned dataset to Box, since this one also takes a moment to run on its own. I tried to incorporate this into the function (to read from the cleaned dataset if it exists), but I think I'm having some trouble getting the if statements layered in correctly. I added my attempt in the comments at the bottom.

Thanks for the review, and feel free to let me know if you have any questions about the code or the points above!

@wcurrangroome

Added documentation and built out the NFIP claims function. Two questions for review: I believe the net_* columns (net_building_payment_amount, net_contents_payment_amount) are the best values to use for the amount paid to the person who submitted the claim, but there are also amount_paid_* columns. The net_* columns are consistently lower than the amount_paid_* columns (with only a few exceptions), so my interpretation is that they take into account the total of cashed / uncashed checks. I wanted to double-check whether you interpret this the same way: https://www.fema.gov/openfema-data-page/fima-nfip-redacted-claims-v2.

  • Yup, I'd anticipate the "net" variables are the ones we want. Post-shutdown, let's follow up and ask.
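
One way to back up the "consistently lower, with only a few exceptions" observation before asking FEMA is to tabulate how the two columns compare row by row. The sketch below is only an illustration in plain Python, not the code in this PR. The column name `amount_paid_on_building_claim` is an assumption (the thread only says `amount_paid..`), and the rows are made-up values rather than real NFIP claims.

```python
def compare_net_to_paid(records, net_col, paid_col):
    """Count how often the net amount is below, equal to, or above the amount paid."""
    counts = {"net_lower": 0, "equal": 0, "net_higher": 0, "missing": 0}
    for row in records:
        net = row.get(net_col)
        paid = row.get(paid_col)
        if net is None or paid is None:
            # Either value is absent, so the row cannot be compared.
            counts["missing"] += 1
        elif net < paid:
            counts["net_lower"] += 1
        elif net > paid:
            counts["net_higher"] += 1
        else:
            counts["equal"] += 1
    return counts


# Made-up rows for illustration only; these are not real NFIP claims.
# ASSUMPTION: "amount_paid_on_building_claim" is a hypothetical column name.
toy_claims = [
    {"net_building_payment_amount": 9000.0, "amount_paid_on_building_claim": 10000.0},
    {"net_building_payment_amount": 5000.0, "amount_paid_on_building_claim": 5000.0},
    {"net_building_payment_amount": 7500.0, "amount_paid_on_building_claim": 7000.0},
    {"net_building_payment_amount": None, "amount_paid_on_building_claim": 1200.0},
]

summary = compare_net_to_paid(
    toy_claims, "net_building_payment_amount", "amount_paid_on_building_claim"
)
print(summary)
```

Looking at the rows that land in `net_higher` may help when following up, since those are the exceptions the interpretation needs to explain.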

I also saved a version of the full cleaned dataset to Box, since this one also takes a moment to run on its own. I tried to incorporate this into the function (to read from the cleaned dataset if it exists), but I think I'm having some trouble getting the if statements layered in correctly. I added my attempt in the comments at the bottom.

  • For me, this all runs pretty quickly from the raw input file. Can you try running it on your end? If it takes under 1 minute, let's leave it as is. If it takes longer, let's circle back to this.
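
If the caching does turn out to be worth revisiting, the "read the cleaned dataset if it exists, otherwise build it" logic can usually be flattened into one early return instead of nested if statements. The following is a minimal sketch in plain Python, not the function in this PR. `load_claims`, `clean_fn`, `use_cache`, and both path arguments are hypothetical names, and CSV stands in for whatever file format the cleaned dataset actually uses.

```python
import csv
from pathlib import Path


def load_claims(raw_path, cache_path, clean_fn, use_cache=True):
    """Return cleaned rows, reading the cached copy when one already exists."""
    cache_path = Path(cache_path)

    # Early return: if a cleaned copy is on disk, use it and skip the cleaning step.
    if use_cache and cache_path.exists():
        with cache_path.open(newline="") as f:
            return list(csv.DictReader(f))

    # Otherwise clean the raw file row by row.
    with Path(raw_path).open(newline="") as f:
        rows = [clean_fn(row) for row in csv.DictReader(f)]

    # Save the cleaned rows so the next call can take the early return above.
    if use_cache and rows:
        with cache_path.open("w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
    return rows
```

One caveat with this design: a CSV cache returns every value as a string, so column types would need to be restored after reading. Passing `use_cache=False` forces a rebuild from the raw file.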

@wcurrangroome commented Oct 20, 2025

  • Can you translate the deductible codes into numeric values?
  • Can you collapse the occupancy_type variable into fewer categories? Something like single-family, 2-4 units, 5+ units, non-residential? If we lose a bit of detail by collapsing, that's fine, especially if some of the very specific codes only apply to a limited number of claims. (I think for "Single residential unit within a multi-unit building", let's collapse this into our multifamily category. And we can use the insured_unit_count value to get at the number of units per claim as needed.)
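
Both requests above amount to lookup tables, which could be sketched roughly as follows. This is an illustration in plain Python, not the code in this PR. The deductible code-to-dollar pairs are believed to match the OpenFEMA data dictionary but cover only a subset and should be verified there. Apart from "Single residential unit within a multi-unit building", the occupancy labels are hypothetical stand-ins, and the generic "multifamily" bucket is a placeholder because the thread does not say which multifamily category that label belongs to.

```python
# ASSUMPTION: partial code-to-dollar lookup; verify against the OpenFEMA
# data dictionary and extend it to the remaining codes before relying on it.
DEDUCTIBLE_DOLLARS = {"0": 500, "1": 1000, "2": 2000, "3": 3000, "4": 4000, "5": 5000}

# ASSUMPTION: apart from the label quoted in the thread, these label strings
# are hypothetical; replace them with the values found in occupancy_type.
OCCUPANCY_GROUPS = {
    "single family residence": "single_family",
    "2 to 4 unit residential building": "units_2_4",
    "residential building with more than 4 units": "units_5_plus",
    "non-residential building": "non_residential",
    # Placeholder bucket: the thread only says "our multifamily category".
    "single residential unit within a multi-unit building": "multifamily",
}


def deductible_to_numeric(code):
    """Translate a deductible code to dollars; unknown or missing codes give None."""
    if code is None:
        return None
    return DEDUCTIBLE_DOLLARS.get(str(code).strip().upper())


def collapse_occupancy(label):
    """Map a detailed occupancy label to a coarse category; unmapped labels give 'other'."""
    if label is None:
        return None
    return OCCUPANCY_GROUPS.get(label.strip().lower(), "other")
```

Returning None for unrecognized deductible codes and "other" for unmapped occupancy labels keeps unexpected values visible, so they can be counted and reviewed rather than silently assigned to a category.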

@wcurrangroome left a comment

All looks good. Reading from disk is the fastest approach. The api = TRUE path should work, but at the moment the FEMA API seems to be down or malfunctioning.

@wcurrangroome wcurrangroome merged commit 408a6c4 into main Oct 30, 2025
1 check passed