Home Basic Data Analysis Create a Personal Portfolio/Wealth Simulation in Python (Part 2)

# Create a Personal Portfolio/Wealth Simulation in Python (Part 2)

Welcome to Part 2 of the series of posts dealing with how to build your own python based personal portfolio /wealth simulation model. At the end of the first post (which can be found here), we got to the point where we had modelled some inflows, some outflows, we had applied an annual salary raise to our future income flows, along with applying various tax rates to both our active income (salary) and investment income.

We had also factored in a stochastic element to our model for generating investment returns, using the historic mean monthly return and volatility of the S&P Total Return index as a proxy for the “market”. Finally, we had ended by adding a couple of lines of code that would record whether our wealth/asset value were reduced to (or below) zero at any point throughout the simulated time period; we will be using this later to help calculate our “risk of ruin”, i.e. attach probabilities to the likelihood of ending up with assets worth less than a threshold value (one that we deem unacceptable to fall below – which doesn’t have to be zero of course).

The full code snippet from the end of Part 1 is shown below for convenience (I have altered some of the outflow and inflow values back to their original levels just fyi). IMPORTANT: I have also had to include the use of the yfinance package to allow the over-riding of the Pandas DataReader packaged as at the time of writing, Pandas DataReader is currently not stable and essentially not working. The code changes are commented below to help identify them more easily.

```import pandas as pd
import numpy as np
import random

# change syntax and name of Pandas DataReader import
from pandas_datareader import data as pdr

# import yfinance as pandas data-reader currently not stable
import yfinance as yf

import datetime

import matplotlib as mpl
import matplotlib.pyplot as plt

%matplotlib inline

# set the numpy random seed to allow replication of results
np.random.seed(seed=7)

start, end = datetime.datetime(2000, 12, 31), datetime.datetime(2020, 1,1)

yf.pdr_override()

sp = pdr.get_data_yahoo("^SP500TR", start=start, end=end)

sp_monthly_pct_return = sp.resample('M').last().pct_change().mean().values
sp_monthly_std_dev = sp.resample('M').last().pct_change().std().values

inflows = {'active_annual_income':50_000,
'starting_assets': 250_000}

outflows = {'rent':1500,
'credit_card_payment':750,
'medical_insurance':250,
'pension_contribution':500,
'misc': 1500}

variables = {'start_date' : "01/01/2020",
'years': 10,
'tax_on_active_income_gains': 0.25,
'avg_ann_income_raise':0.05,
'avg_ann_inflation': 0.02,
'tax_on_investment_gains': 0.35,
'avg_monthly_market_returns': sp_monthly_pct_return,
'avg_monthly_market_volatility': sp_monthly_std_dev}

income_gains_storage = []
investment_gains_storage = []

assets_starting_list = [inflows['starting_assets']]
assets_ending_list = []

months = variables['years'] * 12

ruined = False

for month in range(months):

if assets_ending_list:
assets_starting_list.append(assets_ending_list[-1])

assets = assets_starting_list[-1]

assets -= (outflows['rent'] + outflows['credit_card_payment'] + \
outflows['medical_insurance'] + outflows['pension_contribution'] + \
outflows['misc'])

if assets <= 0:
inv_gain = 0
ruined = True
break

market_return = np.random.normal(variables['avg_monthly_market_returns'],
variables['avg_monthly_market_volatility'],
1)

investment_return = (assets * market_return) * (1 - variables['tax_on_investment_gains'])

investment_gains_storage.append(investment_return)

assets += investment_return

income = (inflows['active_annual_income'] * (1 - variables['tax_on_active_income_gains'])) / 12
income_gains_storage.append(income)

if (month % 12 == 0):
inflows['active_annual_income'] *= (1 + (variables['avg_ann_income_raise']))

outflows['rent'] *= (1 + (variables['avg_ann_inflation']))
outflows['credit_card_payment'] *= (1 + (variables['avg_ann_inflation']))
outflows['medical_insurance'] *= (1 + (variables['avg_ann_inflation']))
outflows['pension_contribution'] *= (1 + (variables['avg_ann_inflation']))
outflows['misc'] *= (1 + (variables['avg_ann_inflation']))

assets += income

assets_ending = assets

assets_ending_list.append(assets_ending)

plt.plot(pd.Series(assets_ending_list))
plt.xlabel('Month')
plt.ylabel('Ending Asset Value')
plt.show()```

The next element we are going to incorporate is one that allows us to set a future date at which we plan to retire and effectively cease working, with the accompanying expectation that our active income (i.e. salary) will fall dramatically, if not disappear altogether. Retirement is an important stage of life to plan for, for many reasons; not only would our inflows from salary be affected, it is likely many of our outflows would change too. By retirement age, it is not uncommon for people to have paid off any outstanding mortgage debt on their homes. This means no more monthly mortgage outflows, and offers other opportunities such as equity release schemes, or “downsizing” of property etc.

Of utmost importance is to ensure that by the time you retire, you have build an asset base large enough to sustain your needs and generate enough investment income to cover your living costs, for how ever many years you end up living in retirement.

With this in mind let us start by adding a planned retirement date (entered as years from today until retirement; I have chosen 25 years in the future as that would place me around the right age). We shall also extend the simulation period from 10 years to 40. I have also added a couple of lines to check whether we are before or after our planned retirement date and adjust our monthly income (i.e. salary inflow) accordingly.

```np.random.seed(seed=7)

start, end = datetime.datetime(2000, 12, 31), datetime.datetime(2020, 1,1)

yf.pdr_override()

sp = pdr.get_data_yahoo("^SP500TR", start=start, end=end)

sp_monthly_pct_return = sp.resample('M').last().pct_change().mean().values
sp_monthly_std_dev = sp.resample('M').last().pct_change().std().values

inflows = {'active_annual_income':50_000,
'starting_assets': 250_000}

outflows = {'rent':1500,
'credit_card_payment':750,
'medical_insurance':250,
'pension_contribution':500,
'misc': 1500}

variables = {'start_date' : "01/01/2020",
'years': 40,
'retirement_year': 25,
'tax_on_active_income_gains': 0.25,
'avg_ann_income_raise':0.05,
'avg_ann_inflation': 0.02,
'tax_on_investment_gains': 0.35,
'avg_monthly_market_returns': sp_monthly_pct_return,
'avg_monthly_market_volatility': sp_monthly_std_dev}

income_gains_storage = []
investment_gains_storage = []

assets_starting_list = [inflows['starting_assets']]
assets_ending_list = []

months = variables['years'] * 12

ruined = False

for month in range(months):

if assets_ending_list:
assets_starting_list.append(assets_ending_list[-1])

assets = assets_starting_list[-1]

assets -= (outflows['rent'] + outflows['credit_card_payment'] + \
outflows['medical_insurance'] + outflows['pension_contribution'] + \
outflows['misc'])

if assets <= 0:
inv_gain = 0
ruined = True
break

market_return = np.random.normal(variables['avg_monthly_market_returns'],
variables['avg_monthly_market_volatility'],
1)

investment_return = (assets * market_return) * (1 - variables['tax_on_investment_gains'])

investment_gains_storage.append(investment_return)

assets += investment_return

# add logic to account for zero salary income once retired
if month >= variables['retirement_year'] * 12:
income = 0
else:
income = (inflows['active_annual_income'] * (1 - variables['tax_on_active_income_gains'])) / 12
income_gains_storage.append(income)

if (month % 12 == 0):
inflows['active_annual_income'] *= (1 + (variables['avg_ann_income_raise']))

outflows['rent'] *= (1 + (variables['avg_ann_inflation']))
outflows['credit_card_payment'] *= (1 + (variables['avg_ann_inflation']))
outflows['medical_insurance'] *= (1 + (variables['avg_ann_inflation']))
outflows['pension_contribution'] *= (1 + (variables['avg_ann_inflation']))
outflows['misc'] *= (1 + (variables['avg_ann_inflation']))

assets += income

assets_ending = assets

assets_ending_list.append(assets_ending)

plt.plot(pd.Series(assets_ending_list))
plt.xlabel('Month')
plt.ylabel('Ending Asset Value')
plt.show()```

Now we can clearly see the effect our drop in salary after retirement has on our ending asset values through time. As is perfectly logical, with no salary coming in the tendency is for our personal wealth and asset base to fall over time. Firstly, we still need to account for the fact that both our post-retirement inflows and outflows are usually fundamentally different from the “normal”, non-retirement situation, and we shall come to that in just a second.

Let’s assume post-retirement we no longer pay “rent” or mortgage payments, so the current rent outflow of 1,500 no longer applies once we hit our retirement date. We could also imagine our health insurance costs and general medical expenses would rise significantly on average, and we might start to receive some pension income each month. The exact situation will vary for each individual, so for now I shall apply the above logic and go from there.

```np.random.seed(seed=7)

start, end = datetime.datetime(2000, 12, 31), datetime.datetime(2020, 1,1)

yf.pdr_override()

sp = pdr.get_data_yahoo("^SP500TR", start=start, end=end)

sp_monthly_pct_return = sp.resample('M').last().pct_change().mean().values
sp_monthly_std_dev = sp.resample('M').last().pct_change().std().values

inflows = {'active_annual_income':50_000,
'starting_assets': 250_000,
'monthly_pension': 1500} # add estimated post-retirement monthly pension

outflows = {'rent':1500,
'credit_card_payment':750,
'medical_insurance':250,
'pension_contribution':500,
'misc': 1500,
'retirement_medical_expenses':850, # add post-retirement monthly medical expenses
'retirement_misc': 2000} # add post-retirement misc costs

variables = {'start_date' : "01/01/2020",
'years': 40,
'retirement_year': 25,
'tax_on_active_income_gains': 0.25,
'avg_ann_income_raise':0.05,
'avg_ann_inflation': 0.02,
'tax_on_investment_gains': 0.35,
'avg_monthly_market_returns': sp_monthly_pct_return,
'avg_monthly_market_volatility': sp_monthly_std_dev}

income_gains_storage = []
investment_gains_storage = []

assets_starting_list = [inflows['starting_assets']]
assets_ending_list = []

months = variables['years'] * 12

ruined = False

for month in range(months):

if assets_ending_list:
assets_starting_list.append(assets_ending_list[-1])

assets = assets_starting_list[-1]

# add logic to account for different outflows in retirement
if month >= variables['retirement_year'] * 12:
outflow = outflows['retirement_medical_expenses'] + outflows['retirement_misc']

else:

outflow = (outflows['rent'] + outflows['credit_card_payment'] + \
outflows['medical_insurance'] + outflows['pension_contribution'] + \
outflows['misc'])

assets -= outflow

if assets <= 0:
inv_gain = 0
ruined = True
break

market_return = np.random.normal(variables['avg_monthly_market_returns'],
variables['avg_monthly_market_volatility'],
1)

investment_return = (assets * market_return) * (1 - variables['tax_on_investment_gains'])

investment_gains_storage.append(investment_return)

assets += investment_return

# add logic to account for pension received in retirement with no salary
if month >= variables['retirement_year'] * 12:
income = inflows['monthly_pension']
else:
income = (inflows['active_annual_income'] * (1 - variables['tax_on_active_income_gains'])) / 12

income_gains_storage.append(income)

if (month % 12 == 0):
inflows['active_annual_income'] *= (1 + (variables['avg_ann_income_raise']))

outflows['rent'] *= (1 + (variables['avg_ann_inflation']))
outflows['credit_card_payment'] *= (1 + (variables['avg_ann_inflation']))
outflows['medical_insurance'] *= (1 + (variables['avg_ann_inflation']))
outflows['pension_contribution'] *= (1 + (variables['avg_ann_inflation']))
outflows['misc'] *= (1 + (variables['avg_ann_inflation']))

assets += income

assets_ending = assets

assets_ending_list.append(assets_ending)

plt.plot(pd.Series(assets_ending_list))
plt.xlabel('Month')
plt.ylabel('Ending Asset Value')
plt.show()```

Ok the above code implements the logic we have discussed thus far. It is, however, getting somewhat messy and could really do with being cleaned up. There are a few parts of the script that could be wrapped up as functions – the two parts that spring to mind are those for calculating our inflows and outflows each period. Let’s start with the outflows:

```def calculate_outflows(month, outflows, variables):

# logic to account for different outflows in retirement
if month >= variables['retirement_year'] * 12:
outflow = outflows['retirement_medical_expenses'] + outflows['retirement_misc']

else:
# logic to account for different outflows before retirement
outflow = (outflows['rent'] + outflows['credit_card_payment'] + \
outflows['medical_insurance'] + outflows['pension_contribution'] + \
outflows['misc'])

# each year, increment the values according to the annual inflation input
if (month % 12 == 0) and (month > 0):

outflows['rent'] *= (1 + (variables['avg_ann_inflation']))
outflows['credit_card_payment'] *= (1 + (variables['avg_ann_inflation']))
outflows['medical_insurance'] *= (1 + (variables['avg_ann_inflation']))
outflows['pension_contribution'] *= (1 + (variables['avg_ann_inflation']))
outflows['misc'] *= (1 + (variables['avg_ann_inflation']))

return outflow```

And now our income (not including our investment income – we will create a separate function for that).

```def calculate_income(month, inflows, variables):

# logic to account for different inflows in retirement
if month >= variables['retirement_year'] * 12:
income = inflows['monthly_pension']

else:
# logic to account for different inflows before retirement
income = (inflows['active_annual_income'] * (1 - variables['tax_on_active_income_gains'])) / 12
if (month % 12 == 0) and (month > 0):
inflows['active_annual_income'] *= (1 + (variables['avg_ann_income_raise']))

return income```

And now to incorporate it into our main script:

```np.random.seed(seed=7)

start, end = datetime.datetime(2000, 12, 31), datetime.datetime(2020, 1,1)

yf.pdr_override()

sp = pdr.get_data_yahoo("^SP500TR", start=start, end=end)

sp_monthly_pct_return = sp.resample('M').last().pct_change().mean().values
sp_monthly_std_dev = sp.resample('M').last().pct_change().std().values

inflows = {'active_annual_income':50_000,
'starting_assets': 250_000,
'monthly_pension': 1500}

outflows = {'rent':1500,
'credit_card_payment':750,
'medical_insurance':250,
'pension_contribution':500,
'misc': 1500,
'retirement_medical_expenses':850,
'retirement_misc': 2000}

variables = {'start_date' : "01/01/2020",
'years': 40,
'retirement_year': 25,
'tax_on_active_income_gains': 0.25,
'avg_ann_income_raise':0.05,
'avg_ann_inflation': 0.02,
'tax_on_investment_gains': 0.35,
'avg_monthly_market_returns': sp_monthly_pct_return,
'avg_monthly_market_volatility': sp_monthly_std_dev}

income_gains_storage = []
investment_gains_storage = []

assets_starting_list = [inflows['starting_assets']]
assets_ending_list = []

months = variables['years'] * 12

ruined = False

for month in range(months):

if assets_ending_list:
assets_starting_list.append(assets_ending_list[-1])

assets = assets_starting_list[-1]

# calculate outflows using created function
outflow = calculate_outflows(month, outflows, variables)

assets -= outflow

if assets <= 0:
inv_gain = 0
ruined = True
break

market_return = np.random.normal(variables['avg_monthly_market_returns'],
variables['avg_monthly_market_volatility'],
1)

investment_return = (assets * market_return) * (1 - variables['tax_on_investment_gains'])

investment_gains_storage.append(investment_return)

assets += investment_return

# calculate outflows using created function
income = calculate_income(month, inflows, variables)

income_gains_storage.append(income)

# remove the code that was previously here regarding the annual incrementing of values

assets += income

assets_ending = assets

assets_ending_list.append(assets_ending)

plt.plot(pd.Series(assets_ending_list))
plt.xlabel('Month')
plt.ylabel('Ending Asset Value')
plt.show()```

We can see from the plot above that the results are the same as those we generated before we added the use of functions, which bodes well and suggests the functions have been correctly structured and applied. We might also want to take the code used to calculate the monthly investment income and warp that into a function. This will allow us to more easily add complexity to the logic if necessary.

```def calculate_investment_gains(assets, variables):

if assets <= 0:
inv_gains = 0

else:
market_return = np.random.normal(variables['avg_monthly_market_returns'],
variables['avg_monthly_market_volatility'],
1)

inv_gains = assets * market_return

return inv_gains```

And now to add the new investment gains function to our script:

```np.random.seed(seed=7)

start, end = datetime.datetime(2000, 12, 31), datetime.datetime(2020, 1,1)

yf.pdr_override()

sp = pdr.get_data_yahoo("^SP500TR", start=start, end=end)

sp_monthly_pct_return = sp.resample('M').last().pct_change().mean().values
sp_monthly_std_dev = sp.resample('M').last().pct_change().std().values

inflows = {'active_annual_income':50_000,
'starting_assets': 250_000,
'monthly_pension': 1500}

outflows = {'rent':1500,
'credit_card_payment':750,
'medical_insurance':250,
'pension_contribution':500,
'misc': 1500,
'retirement_medical_expenses':850,
'retirement_misc': 2000}

variables = {'start_date' : "01/01/2020",
'years': 40,
'retirement_year': 25,
'tax_on_active_income_gains': 0.25,
'avg_ann_income_raise':0.05,
'avg_ann_inflation': 0.02,
'tax_on_investment_gains': 0.35,
'avg_monthly_market_returns': sp_monthly_pct_return,
'avg_monthly_market_volatility': sp_monthly_std_dev}

income_gains_storage = []
investment_gains_storage = []

assets_starting_list = [inflows['starting_assets']]
assets_ending_list = []

months = variables['years'] * 12

ruined = False

for month in range(months):

if assets_ending_list:
assets_starting_list.append(assets_ending_list[-1])

assets = assets_starting_list[-1]

outflow = calculate_outflows(month, outflows, variables)

assets -= outflow

# altered the ``if`` code block below to account for use of new function
if assets <= 0:
ruined = True
break

# use our new function to calculate investment returns
investment_return = calculate_investment_gains(assets, variables)

investment_gains_storage.append(investment_return)

assets += investment_return

income = calculate_income(month, inflows, variables)

income_gains_storage.append(income)

assets += income

assets_ending = assets

assets_ending_list.append(assets_ending)

plt.plot(pd.Series(assets_ending_list))
plt.xlabel('Month')
plt.ylabel('Ending Asset Value')
plt.show()```

Just FYI the way we have moved the calculation of monthly returns inside a function and haven’t included the setting of the numpy seed value within that function means that now, each time we run it, we will get slightly different results. I didn’t include the setting of the random seed as if I had, each month’s market return output would have been exactly the same which is of course not what we want in this case.

Now that we have cleaned the code up a bit and started to refactor some of the logic into separate functions I am going to leave it here for this pos. There is still plenty to do before we are finished and we will pick it back up in Part 3.