import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import folium
from folium.plugins import MarkerCluster
from branca.colormap import LinearColormap
from geopy.distance import geodesic
pd.set_option('display.max_columns', None)
Geographic Analysis of Charity Donors - Latest Leaflet Maps
Python
EH
An anonymised display of voluntary work conducted for Emmanuel House Support Centre. Post 3/5 in the series.
df = pd.read_csv('data.csv').drop('DistanceFromEHSC', axis=1)
df.head()
| | Latitude | Longitude | Newsletter | Transactions_LifetimeGiftsAmount | Transactions_LifetimeGiftsNumber | Transactions_AverageGiftAmount | DonationFrequency | DonationFrequencyActive | Transactions_Months1To12GiftsAmount | Transactions_Months1To12GiftsNumber | monthlyDonorMonths1to12 | Transactions_Months13To24GiftsAmount | Transactions_Months13To24GiftsNumber | monthlyDonorMonths13to24 | Transactions_Months25To36GiftsAmount | Transactions_Months25To36GiftsNumber | monthlyDonorMonths25to36 | Transactions_DateOfFirstGift | Transactions_FirstGiftAmount | Transactions_DateOfLastGift | Transactions_LastGiftAmount | monthsSinceFirstDonation | monthsSinceLastDonation | activeMonths | Transactions_DateOfHighestGift | Transactions_HighestGiftAmount | Transactions_DateOfLowestGift | Transactions_LowestGiftAmount |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 52.961066 | -1.205200 | 1 | 419.66 | 4 | 104.915000 | 0.133333 | 0.148148 | 152.89 | 1 | 0 | 154.77 | 1 | 0 | 97.19 | 1 | 0 | 2021-10-23 | 96.09 | 2023-12-31 | 148.05 | 29 | 3 | 27 | 2024-01-01 | 148.44 | 2021-10-20 | 99.05 |
| 1 | 52.924105 | -1.216433 | 1 | 111.05 | 3 | 37.016667 | 0.103448 | 0.230769 | 0.00 | 0 | 0 | 50.43 | 1 | 0 | 52.93 | 1 | 0 | 2021-11-26 | 48.27 | 2022-11-17 | 50.97 | 28 | 16 | 13 | 2022-11-21 | 48.67 | 2022-11-19 | 49.69 |
| 2 | 52.936510 | -1.127547 | 1 | 982.82 | 11 | 89.347273 | 0.297297 | 1.000000 | 0.00 | 0 | 0 | 0.00 | 0 | 0 | 901.00 | 10 | 0 | 2021-03-30 | 97.38 | 2022-01-13 | 103.32 | 36 | 26 | 11 | 2022-01-12 | 97.97 | 2022-01-13 | 102.72 |
| 3 | 52.997952 | -1.189854 | 1 | 20.45 | 2 | 10.225000 | 0.117647 | 2.000000 | 0.00 | 0 | 0 | 10.45 | 1 | 0 | 0.00 | 0 | 0 | 2022-12-06 | 10.03 | 2022-12-06 | 9.92 | 16 | 16 | 1 | 2022-12-06 | 9.79 | 2022-12-03 | 9.94 |
| 4 | 52.971756 | -1.203969 | 1 | 20.75 | 2 | 10.375000 | 0.074074 | 2.000000 | 0.00 | 0 | 0 | 0.00 | 0 | 0 | 10.75 | 1 | 0 | 2022-01-20 | 10.52 | 2022-01-20 | 9.68 | 26 | 26 | 1 | 2022-01-23 | 9.64 | 2022-01-20 | 10.02 |
The Data
The fictional donors in data.csv have the following features:
| Feature | Description |
|---|---|
| Latitude | The approximate latitude of the donor. |
| Longitude | The approximate longitude of the donor. |
| Newsletter | A binary indicator (1 or 0) representing whether the donor is subscribed to the Charity’s email newsletter. |
| Transactions_LifetimeGiftsAmount | The total amount of donations made by the donor over their lifetime. |
| Transactions_LifetimeGiftsNumber | The total number of donations made by the donor over their lifetime. |
| Transactions_AverageGiftAmount | The average donation made by the donor. |
| DonationFrequency | The frequency of donations made by the donor (donations/month). |
| DonationFrequencyActive | The frequency of donations made by the donor while they were active (donations/month). |
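| Transactions_Months1To12GiftsAmount | The total amount of donations made by the donor in the latest 12 months. |
| Transactions_Months1To12GiftsNumber | The total number of donations made by the donor in the latest 12 months. |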
| monthlyDonorMonths1to12 | A binary indicator showing if the donor made monthly donations in the latest 12 months. |
| Transactions_Months13To24GiftsAmount | The total amount of donations made by the donor in the latest 13-24 months. |
| Transactions_Months13To24GiftsNumber | The total number of donations made by the donor in the latest 13-24 months. |
| monthlyDonorMonths13to24 | A binary indicator showing if the donor made monthly donations in the latest 13-24 months. |
| Transactions_Months25To36GiftsAmount | The total amount of donations made by the donor in the latest 25-36 months. |
| Transactions_Months25To36GiftsNumber | The total number of donations made by the donor in the latest 25-36 months. |
| monthlyDonorMonths25to36 | A binary indicator showing if the donor made monthly donations in the latest 25-36 months. |
| Transactions_DateOfFirstGift | The date of the first donation made by the donor. |
| Transactions_FirstGiftAmount | The amount of the first donation made by the donor. |
| Transactions_DateOfLastGift | The date of the last donation made by the donor. |
| Transactions_LastGiftAmount | The amount of the last donation made by the donor. |
| monthsSinceFirstDonation | The number of months since the donor’s first donation. |
| monthsSinceLastDonation | The number of months since the donor’s last donation. |
| activeMonths | The number of months the donor has been an active supporter of the Charity. |
| Transactions_DateOfHighestGift | The date of the highest donation made by the donor. |
| Transactions_HighestGiftAmount | The amount of the highest donation made by the donor. |
| Transactions_DateOfLowestGift | The date of the lowest donation made by the donor. |
| Transactions_LowestGiftAmount | The amount of the lowest donation made by the donor. |
Note that these data points have been randomized, permuted and anonymized. This is not the true data of real donors to the charity.
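The two frequency features can be sanity-checked against the count and month columns. In the rows displayed above, `DonationFrequency` appears to equal the number of gifts divided by `monthsSinceFirstDonation + 1`, and `DonationFrequencyActive` the number of gifts divided by `activeMonths`. These formulas are inferred from the displayed rows, not stated anywhere in the post, so treat them as an assumption. A minimal check on three of those rows:

```python
import numpy as np
import pandas as pd

# Three rows copied from the df.head() output above.
sample = pd.DataFrame({
    'Transactions_LifetimeGiftsNumber': [4, 3, 11],
    'monthsSinceFirstDonation': [29, 28, 36],
    'activeMonths': [27, 13, 11],
    'DonationFrequency': [0.133333, 0.103448, 0.297297],
    'DonationFrequencyActive': [0.148148, 0.230769, 1.000000],
})

# Assumed formulas (inferred from the sample, not confirmed by the source).
freq = sample['Transactions_LifetimeGiftsNumber'] / (sample['monthsSinceFirstDonation'] + 1)
freq_active = sample['Transactions_LifetimeGiftsNumber'] / sample['activeMonths']

# Compare against the published values, allowing for 6-decimal rounding.
freq_matches = np.isclose(freq, sample['DonationFrequency'], atol=1e-6)
freq_active_matches = np.isclose(freq_active, sample['DonationFrequencyActive'], atol=1e-6)
print(freq_matches.all(), freq_active_matches.all())
```

If the assumption holds, a donor who gave twice in a single active month has a `DonationFrequencyActive` of 2.0, which would explain the many values of exactly 2.000000 in the samples.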
df.sample(10)
| | Latitude | Longitude | Newsletter | Transactions_LifetimeGiftsAmount | Transactions_LifetimeGiftsNumber | Transactions_AverageGiftAmount | DonationFrequency | DonationFrequencyActive | Transactions_Months1To12GiftsAmount | Transactions_Months1To12GiftsNumber | monthlyDonorMonths1to12 | Transactions_Months13To24GiftsAmount | Transactions_Months13To24GiftsNumber | monthlyDonorMonths13to24 | Transactions_Months25To36GiftsAmount | Transactions_Months25To36GiftsNumber | monthlyDonorMonths25to36 | Transactions_DateOfFirstGift | Transactions_FirstGiftAmount | Transactions_DateOfLastGift | Transactions_LastGiftAmount | monthsSinceFirstDonation | monthsSinceLastDonation | activeMonths | Transactions_DateOfHighestGift | Transactions_HighestGiftAmount | Transactions_DateOfLowestGift | Transactions_LowestGiftAmount |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 425 | 52.976130 | -1.142525 | 0 | 148.69 | 2 | 74.345000 | 0.054054 | 2.000000 | 0.00 | 0 | 0 | 0.00 | 0 | 0 | 73.69 | 1 | 0 | 2021-03-30 | 76.11 | 2021-04-01 | 74.40 | 36 | 36 | 1 | 2021-04-03 | 75.72 | 2021-04-03 | 75.02 |
| 39 | 52.974401 | -1.105213 | 1 | 19.90 | 2 | 9.950000 | 0.071429 | 2.000000 | 0.00 | 0 | 0 | 0.00 | 0 | 0 | 9.90 | 1 | 0 | 2021-12-15 | 9.73 | 2021-12-15 | 10.00 | 27 | 27 | 1 | 2021-12-16 | 10.15 | 2021-12-19 | 9.91 |
| 13 | 52.911530 | -1.107064 | 0 | 198.84 | 3 | 66.280000 | 0.250000 | 3.000000 | 98.84 | 2 | 0 | 0.00 | 0 | 0 | 0.00 | 0 | 0 | 2023-04-23 | 49.53 | 2023-04-23 | 50.55 | 11 | 11 | 1 | 2023-04-27 | 48.94 | 2023-04-23 | 50.41 |
| 227 | 52.925344 | -1.259633 | 1 | 50.17 | 2 | 25.085000 | 0.153846 | 2.000000 | 25.17 | 1 | 0 | 0.00 | 0 | 0 | 0.00 | 0 | 0 | 2023-03-29 | 24.72 | 2023-03-30 | 24.49 | 12 | 12 | 1 | 2023-03-26 | 24.75 | 2023-03-28 | 24.65 |
| 310 | 52.952328 | -1.159730 | 1 | 337.34 | 35 | 9.638286 | 0.972222 | 1.000000 | 105.45 | 11 | 0 | 125.94 | 13 | 1 | 96.48 | 10 | 0 | 2021-05-01 | 9.94 | 2024-02-28 | 9.87 | 35 | 1 | 35 | 2024-02-29 | 9.62 | 2024-02-27 | 9.32 |
| 563 | 52.970545 | -1.136419 | 1 | 323.56 | 16 | 20.222500 | 0.444444 | 1.066667 | 0.00 | 0 | 0 | 83.93 | 4 | 0 | 219.63 | 11 | 0 | 2021-04-30 | 20.04 | 2022-06-30 | 19.66 | 35 | 21 | 15 | 2022-06-26 | 20.73 | 2022-06-29 | 21.06 |
| 302 | 52.986120 | -1.147062 | 0 | 1794.31 | 24 | 74.762917 | 0.648649 | 1.043478 | 0.00 | 0 | 0 | 912.18 | 12 | 1 | 807.13 | 11 | 0 | 2021-04-02 | 71.64 | 2023-01-30 | 77.85 | 36 | 14 | 23 | 2023-01-30 | 73.85 | 2023-02-01 | 74.46 |
| 37 | 52.990988 | -1.137386 | 1 | 231.41 | 4 | 57.852500 | 1.333333 | 2.000000 | 156.41 | 3 | 0 | 0.00 | 0 | 0 | 0.00 | 0 | 0 | 2024-01-03 | 51.10 | 2024-02-27 | 48.58 | 2 | 1 | 2 | 2024-03-02 | 50.08 | 2024-02-28 | 50.89 |
| 398 | 52.940981 | -1.214600 | 1 | 93.46 | 18 | 5.192222 | 0.900000 | 1.000000 | 52.85 | 10 | 0 | 35.89 | 7 | 0 | 0.00 | 0 | 0 | 2022-08-27 | 5.05 | 2024-01-02 | 5.05 | 19 | 2 | 18 | 2024-01-04 | 5.17 | 2023-12-31 | 4.99 |
| 95 | 52.964628 | -1.149133 | 1 | 86.08 | 16 | 5.380000 | 0.444444 | 0.444444 | 63.50 | 12 | 1 | 10.05 | 2 | 0 | 10.17 | 1 | 0 | 2021-04-30 | 10.22 | 2024-03-08 | 4.84 | 35 | 0 | 36 | 2021-05-01 | 10.12 | 2024-03-05 | 4.95 |
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 750 entries, 0 to 749
Data columns (total 28 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Latitude 750 non-null float64
1 Longitude 750 non-null float64
2 Newsletter 750 non-null int64
3 Transactions_LifetimeGiftsAmount 750 non-null float64
4 Transactions_LifetimeGiftsNumber 750 non-null int64
5 Transactions_AverageGiftAmount 750 non-null float64
6 DonationFrequency 750 non-null float64
7 DonationFrequencyActive 750 non-null float64
8 Transactions_Months1To12GiftsAmount 750 non-null float64
9 Transactions_Months1To12GiftsNumber 750 non-null int64
10 monthlyDonorMonths1to12 750 non-null int64
11 Transactions_Months13To24GiftsAmount 750 non-null float64
12 Transactions_Months13To24GiftsNumber 750 non-null int64
13 monthlyDonorMonths13to24 750 non-null int64
14 Transactions_Months25To36GiftsAmount 750 non-null float64
15 Transactions_Months25To36GiftsNumber 750 non-null int64
16 monthlyDonorMonths25to36 750 non-null int64
17 Transactions_DateOfFirstGift 750 non-null object
18 Transactions_FirstGiftAmount 750 non-null float64
19 Transactions_DateOfLastGift 750 non-null object
20 Transactions_LastGiftAmount 750 non-null float64
21 monthsSinceFirstDonation 750 non-null int64
22 monthsSinceLastDonation 750 non-null int64
23 activeMonths 750 non-null int64
24 Transactions_DateOfHighestGift 750 non-null object
25 Transactions_HighestGiftAmount 750 non-null float64
26 Transactions_DateOfLowestGift 750 non-null object
27 Transactions_LowestGiftAmount 750 non-null float64
dtypes: float64(13), int64(11), object(4)
memory usage: 164.2+ KB
Converting the columns containing dates to datetime type
date_cols = ['Transactions_DateOfFirstGift', 'Transactions_DateOfLowestGift', 'Transactions_DateOfHighestGift', 'Transactions_DateOfLastGift']
for col in date_cols:
    df[col] = pd.to_datetime(df[col])
ENGINEERING DISTANCE FROM EHSC
def calculate_distance(row, base_coords):
    return geodesic((row['Latitude'], row['Longitude']), base_coords).km
ehsc_coords = (52.95383, -1.14168)
df['DistanceFromEHSC'] = df.apply(calculate_distance, axis=1, base_coords=ehsc_coords)
df['DistanceFromEHSC']
0 4.344046
1 6.016685
2 2.148884
3 5.880628
4 4.636596
...
745 2.801735
746 3.900813
747 6.047917
748 3.728712
749 3.338494
Name: DistanceFromEHSC, Length: 750, dtype: float64
df['DistanceFromEHSC'].plot()
df['DistanceFromEHSC'].describe()
count 750.000000
mean 4.603878
std 2.307654
min 0.096575
25% 2.821600
50% 4.273026
75% 6.219849
max 9.978092
Name: DistanceFromEHSC, dtype: float64
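The row-wise `geodesic` call above is fine for 750 donors. For much larger tables, a vectorised haversine formula is a common alternative. The sketch below is not the method used in this post: it treats the Earth as a sphere (the `EARTH_RADIUS_KM` constant and the `haversine_km` helper are illustrative names), so over a few kilometres its results should differ from the ellipsoidal `geodesic` values by a fraction of a percent.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation

def haversine_km(lat, lon, base_lat, base_lon):
    """Great-circle distance in km; lat/lon may be scalars or NumPy arrays."""
    lat, lon = np.radians(lat), np.radians(lon)
    base_lat, base_lon = np.radians(base_lat), np.radians(base_lon)
    dlat = lat - base_lat
    dlon = lon - base_lon
    a = np.sin(dlat / 2) ** 2 + np.cos(lat) * np.cos(base_lat) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

# Coordinates of the first two donors from df.head(), and the EHSC coordinates from the post.
lats = np.array([52.961066, 52.924105])
lons = np.array([-1.205200, -1.216433])
approx_km = haversine_km(lats, lons, 52.95383, -1.14168)
print(approx_km)
```

On the full table this could be called as `haversine_km(df['Latitude'].to_numpy(), df['Longitude'].to_numpy(), *ehsc_coords)`, which avoids `df.apply` entirely.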
df.to_csv('data.csv', index=False)
VISUALISING DISTRIBUTION OF DONORS WITH FOLIUM
m = folium.Map(location=[52.9548, -1.1581], zoom_start=12)
colors = ['green', 'yellow', 'orange', 'red', 'purple']
linear_colormap = LinearColormap(colors=colors,
index=[0, 100, 250, 500, 1000],
vmin=df['Transactions_LifetimeGiftsAmount'].min(),
vmax=df['Transactions_LifetimeGiftsAmount'].quantile(0.94))
# Create FeatureGroups
fgroups = [folium.map.FeatureGroup(name=f"Total Donated: £{lower}{('-£' + str(upper)) if upper != float('inf') else '+'}") for lower, upper in zip([0, 100, 250, 500, 750, 1000], [100, 250, 500, 750, 1000, float('inf')])]
for index, row in df.iterrows():
    fname = 'Example'
    lname = 'Donor'
    email = 'exampledonor@email.com'
    total_don = row['Transactions_LifetimeGiftsAmount']
    num_don = row['Transactions_LifetimeGiftsNumber']
    avg_don = row['Transactions_AverageGiftAmount']
    news = bool(row['Newsletter'])
    monthly = bool(row['monthlyDonorMonths1to12'])
    lat = row['Latitude']
    long = row['Longitude']
    dateoffirst = row['Transactions_DateOfFirstGift'].strftime('%d/%m/%Y')
    dateoflast = row['Transactions_DateOfLastGift'].strftime('%d/%m/%Y')
    active = row['activeMonths']
    freq = row['DonationFrequency']
    freq_active = row['DonationFrequencyActive']
    dist = row['DistanceFromEHSC']
    popup_text = f'''
    <div style="width: 200px; font-family: Arial; line-height: 1.2;">
    <h4 style="margin-bottom: 5px;">{fname} {lname}</h4>
    <p style="margin: 0;"><b>Total Donated:</b> £{total_don:.2f}</p>
    <p style="margin: 0;"><b>Number of Donations:</b> {num_don}</p>
    <p style="margin: 0;"><b>Average Donation:</b> £{avg_don:.2f}</p>
    <br>\
    <p style="margin: 0;"><b>First Recorded Donation:</b> {dateoffirst}</p>
    <p style="margin: 0;"><b>Last Recorded Donation:</b> {dateoflast}</p>
    <br>\
    <p style="margin: 0;"><b>ActiveMonths:</b> {active}</p>
    <p style="margin: 0;"><b>DonationFrequency</b> {freq:.2f}</p>
    <p style="margin: 0;"><b>DonationFrequencyActive</b> {freq_active:.2f}</p>
    <br>\
    <p style="margin: 0;"><b>Subscribed to Newsletter:</b> {"Yes" if news else "No"}</p>
    <p style="margin: 0;"><b>Current Monthly Donor:</b> {"Yes" if monthly else "No"}</p>
    <br>\
    <p style="margin: 0;"><b>Distance from EHSC:</b> {dist:.2f}km</p>
    <br>\
    <p style="margin: 0;"><b>Email:</b><br> {email}</p>
    </div>
    '''
    color = linear_colormap(total_don)
    marker = folium.CircleMarker(
        location=[lat, long],
        radius=5,
        color=color,
        fill=True,
        fill_color=color,
        fill_opacity=0.7,
        popup=popup_text
    )
    # Add the marker to the appropriate FeatureGroup
    for fgroup, (lower, upper) in zip(fgroups, zip([0, 100, 250, 500, 750, 1000], [100, 250, 500, 750, 1000, float('inf')])):
        if lower <= total_don < upper:
            fgroup.add_child(marker)
            break
# Add the FeatureGroups to the map
for fgroup in fgroups:
    m.add_child(fgroup)
linear_colormap.add_to(m)
linear_colormap.caption = 'Total Donated (£)'
m.add_child(folium.LayerControl())
# Create a marker at EHSC
popup_html = '''<h4 style="margin-bottom: 5px;">Emmanuel House Support Centre</h4>
<a href="https://www.emmanuelhouse.org.uk/" target="_blank">https://www.emmanuelhouse.org.uk/</a>
<p>Emmanuel House is an independent charity that supports people who are homeless, rough sleeping, in crisis, or at risk of homelessness in Nottingham.</p>
'''
marker = folium.Marker(location=ehsc_coords, popup=folium.Popup(popup_html))
m.add_child(marker)
m
[Interactive Folium map]
Note the popups that appear when clicking on each datapoint in the above map!
There is a layer control menu hidden in the top right corner until mouseover. It lets you show and hide the points with donation totals in specific ranges.
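The donation bands behind those layers can also be computed in one step with `pandas.cut`, rather than by looping over the band edges for every donor. This is a sketch of an alternative, not the code used for the map above; the `edges`, `labels` and `totals` names are illustrative, and the totals are example values.

```python
import numpy as np
import pandas as pd

# Same band edges as the FeatureGroups above; the last band is open-ended.
edges = [0, 100, 250, 500, 750, 1000, np.inf]
labels = [
    f"Total Donated: £{lo}-£{hi}" if np.isfinite(hi) else f"Total Donated: £{lo}+"
    for lo, hi in zip(edges[:-1], edges[1:])
]

# Illustrative lifetime totals (example values only).
totals = pd.Series([20.45, 100.00, 419.66, 982.82, 1794.31])

# right=False gives half-open bands [lower, upper), matching `lower <= total_don < upper`.
bands = pd.cut(totals, bins=edges, labels=labels, right=False).astype(str)
print(bands.tolist())
```

With a column of band labels in the DataFrame, each marker could be added to its layer through a dictionary keyed by label, which removes the inner loop.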
REINTERPRETING AS A CLUSTER MAP
m = folium.Map(location=[52.9548, -1.1581], zoom_start=12)
colors = ['green', 'yellow', 'orange', 'red', 'purple']
linear_colormap = LinearColormap(colors=colors,
index=[0, 100, 250, 500, 1000],
vmin=df['Transactions_LifetimeGiftsAmount'].min(),
vmax=df['Transactions_LifetimeGiftsAmount'].quantile(0.94))
# Create a MarkerCluster
marker_cluster = MarkerCluster().add_to(m)
for index, row in df.iterrows():
    fname = 'Example'
    lname = 'Donor'
    email = 'exampledonor@email.com'
    total_don = row['Transactions_LifetimeGiftsAmount']
    num_don = row['Transactions_LifetimeGiftsNumber']
    avg_don = row['Transactions_AverageGiftAmount']
    news = bool(row['Newsletter'])
    monthly = bool(row['monthlyDonorMonths1to12'])
    lat = row['Latitude']
    long = row['Longitude']
    dateoffirst = row['Transactions_DateOfFirstGift'].strftime('%d/%m/%Y')
    dateoflast = row['Transactions_DateOfLastGift'].strftime('%d/%m/%Y')
    active = row['activeMonths']
    freq = row['DonationFrequency']
    freq_active = row['DonationFrequencyActive']
    dist = row['DistanceFromEHSC']
    popup_text = f'''
    <div style="width: 200px; font-family: Arial; line-height: 1.2;">
    <h4 style="margin-bottom: 5px;">{fname} {lname}</h4>
    <p style="margin: 0;"><b>Total Donated:</b> £{total_don:.2f}</p>
    <p style="margin: 0;"><b>Number of Donations:</b> {num_don}</p>
    <p style="margin: 0;"><b>Average Donation:</b> £{avg_don:.2f}</p>
    <br>\
    <p style="margin: 0;"><b>First Recorded Donation:</b> {dateoffirst}</p>
    <p style="margin: 0;"><b>Last Recorded Donation:</b> {dateoflast}</p>
    <br>\
    <p style="margin: 0;"><b>ActiveMonths:</b> {active}</p>
    <p style="margin: 0;"><b>DonationFrequency</b> {freq:.2f}</p>
    <p style="margin: 0;"><b>DonationFrequencyActive</b> {freq_active:.2f}</p>
    <br>\
    <p style="margin: 0;"><b>Subscribed to Newsletter:</b> {"Yes" if news else "No"}</p>
    <p style="margin: 0;"><b>Current Monthly Donor:</b> {"Yes" if monthly else "No"}</p>
    <br>\
    <p style="margin: 0;"><b>Distance from EHSC:</b> {dist:.2f}km</p>
    <br>\
    <p style="margin: 0;"><b>Email:</b><br> {email}</p>
    </div>
    '''
    color = linear_colormap(total_don)
    marker = folium.CircleMarker(
        location=[lat, long],
        radius=5,
        color=color,
        fill=True,
        fill_color=color,
        fill_opacity=0.7,
        popup=popup_text
    )
    # Add the marker to the MarkerCluster
    marker.add_to(marker_cluster)
linear_colormap.add_to(m)
linear_colormap.caption = 'Total Donated (£)'
m.add_child(folium.LayerControl())
# Create a marker at EHSC
popup_html = '''<h4 style="margin-bottom: 5px;">Emmanuel House Support Centre</h4>
<a href="https://www.emmanuelhouse.org.uk/" target="_blank">https://www.emmanuelhouse.org.uk/</a>
<p>Emmanuel House is an independent charity that supports people who are homeless, rough sleeping, in crisis, or at risk of homelessness in Nottingham.</p>
'''
marker = folium.Marker(location=ehsc_coords, popup=folium.Popup(popup_html))
m.add_child(marker)
m
[Interactive Folium map]
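Both map cells above build the same popup markup inline. One way to avoid the duplication is a small helper that takes a row and returns the HTML. The sketch below is a trimmed version with only some of the popup fields; `build_popup_html` is a hypothetical name, and the example row reuses values from the first donor shown in `df.head()`.

```python
from datetime import date

def build_popup_html(row, fname='Example', lname='Donor'):
    """Return popup HTML for one donor; `row` is any mapping keyed by column name."""
    news = 'Yes' if bool(row['Newsletter']) else 'No'
    monthly = 'Yes' if bool(row['monthlyDonorMonths1to12']) else 'No'
    first = row['Transactions_DateOfFirstGift'].strftime('%d/%m/%Y')
    last = row['Transactions_DateOfLastGift'].strftime('%d/%m/%Y')
    return (
        '<div style="width: 200px; font-family: Arial; line-height: 1.2;">'
        f'<h4 style="margin-bottom: 5px;">{fname} {lname}</h4>'
        f'<p style="margin: 0;"><b>Total Donated:</b> £{row["Transactions_LifetimeGiftsAmount"]:.2f}</p>'
        f'<p style="margin: 0;"><b>First Recorded Donation:</b> {first}</p>'
        f'<p style="margin: 0;"><b>Last Recorded Donation:</b> {last}</p>'
        f'<p style="margin: 0;"><b>Subscribed to Newsletter:</b> {news}</p>'
        f'<p style="margin: 0;"><b>Current Monthly Donor:</b> {monthly}</p>'
        f'<p style="margin: 0;"><b>Distance from EHSC:</b> {row["DistanceFromEHSC"]:.2f}km</p>'
        '</div>'
    )

# Example row using values from the first donor in df.head(); a plain dict stands in for a pandas row.
example_row = {
    'Newsletter': 1,
    'monthlyDonorMonths1to12': 0,
    'Transactions_LifetimeGiftsAmount': 419.66,
    'Transactions_DateOfFirstGift': date(2021, 10, 23),
    'Transactions_DateOfLastGift': date(2023, 12, 31),
    'DistanceFromEHSC': 4.344046,
}
html = build_popup_html(example_row)
print(html)
```

Inside either loop the marker would then take `popup=build_popup_html(row)`, so any change to the popup layout only needs making once.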
REMARKS
- The distribution of the fictional donors in data.csv more closely resembles the distribution of the real donors than the synthetic data constructed in my previous blog post: Investigating The Geographic Distribution Of Charity Donors With Interactive Maps Made Using Folium
- For privacy reasons, Latitude and Longitude were deliberately constructed to be the features whose distribution is least representative of the real data. This is why some donors appear on main roads, in Wollaton Park, etc.