Home Basic Data Analysis Create a Personal Portfolio/Wealth Simulation in Python (Part 2)

Create a Personal Portfolio/Wealth Simulation in Python (Part 2)

by s666

Welcome to Part 2 of the series of posts dealing with how to build your own python based personal portfolio /wealth simulation model. At the end of the first post (which can be found here), we got to the point where we had modelled some inflows, some outflows, we had applied an annual salary raise to our future income flows, along with applying various tax rates to both our active income (salary) and investment income.

We had also factored in a stochastic element to our model for generating investment returns, using the historic mean monthly return and volatility of the S&P Total Return index as a proxy for the “market”. Finally, we had ended by adding a couple of lines of code that would record whether our wealth/asset value were reduced to (or below) zero at any point throughout the simulated time period; we will be using this later to help calculate our “risk of ruin”, i.e. attach probabilities to the likelihood of ending up with assets worth less than a threshold value (one that we deem unacceptable to fall below – which doesn’t have to be zero of course).

The full code snippet from the end of Part 1 is shown below for convenience (I have altered some of the outflow and inflow values back to their original levels just fyi). IMPORTANT: I have also had to include the use of the yfinance package to allow the over-riding of the Pandas DataReader packaged as at the time of writing, Pandas DataReader is currently not stable and essentially not working. The code changes are commented below to help identify them more easily.

import pandas as pd
import numpy as np
import random

# change syntax and name of Pandas DataReader import
from pandas_datareader import data as pdr

# import yfinance as pandas data-reader currently not stable
import yfinance as yf

import datetime


import matplotlib as mpl
import matplotlib.pyplot as plt

%matplotlib inline

# set the numpy random seed to allow replication of results
np.random.seed(seed=7)

start, end = datetime.datetime(2000, 12, 31), datetime.datetime(2020, 1,1)


# add yfinance-pandas data-reader overide (now we can use pandas data-reader methods again as currently unstable)
yf.pdr_override()

# change our download to use new over-ridden method
sp = pdr.get_data_yahoo("^SP500TR", start=start, end=end)


sp_monthly_pct_return = sp.resample('M').last().pct_change().mean().values[0]
sp_monthly_std_dev = sp.resample('M').last().pct_change().std().values[0]


inflows = {'active_annual_income':50_000,
           'starting_assets': 250_000} 


outflows = {'rent':1500,
            'credit_card_payment':750,
            'medical_insurance':250,
            'pension_contribution':500,
            'misc': 1500}


variables = {'start_date' : "01/01/2020",
             'years': 10,
             'tax_on_active_income_gains': 0.25,
             'avg_ann_income_raise':0.05,
             'avg_ann_inflation': 0.02,
             'tax_on_investment_gains': 0.35,
             'avg_monthly_market_returns': sp_monthly_pct_return, 
             'avg_monthly_market_volatility': sp_monthly_std_dev} 

income_gains_storage = []
investment_gains_storage = [] 


assets_starting_list = [inflows['starting_assets']] 
assets_ending_list = [] 

months = variables['years'] * 12

ruined = False


for month in range(months):
    

    if assets_ending_list:
        assets_starting_list.append(assets_ending_list[-1])
        

    assets = assets_starting_list[-1]

    assets -= (outflows['rent'] + outflows['credit_card_payment'] + \
              outflows['medical_insurance'] + outflows['pension_contribution'] + \
              outflows['misc'])
    

    if assets <= 0:
        inv_gain = 0
        ruined = True
        break

    market_return = np.random.normal(variables['avg_monthly_market_returns'], 
                                     variables['avg_monthly_market_volatility'],
                                    1)[0]
    
    investment_return = (assets * market_return) * (1 - variables['tax_on_investment_gains'])
  
    investment_gains_storage.append(investment_return)
    

    assets += investment_return
    
    
    income = (inflows['active_annual_income'] * (1 - variables['tax_on_active_income_gains'])) / 12
    income_gains_storage.append(income)
    
    if (month % 12 == 0):
            inflows['active_annual_income'] *= (1 + (variables['avg_ann_income_raise']))

            outflows['rent'] *= (1 + (variables['avg_ann_inflation']))
            outflows['credit_card_payment'] *= (1 + (variables['avg_ann_inflation']))
            outflows['medical_insurance'] *= (1 + (variables['avg_ann_inflation']))
            outflows['pension_contribution'] *= (1 + (variables['avg_ann_inflation']))
            outflows['misc'] *= (1 + (variables['avg_ann_inflation']))
            
            

    assets += income

    assets_ending = assets

    assets_ending_list.append(assets_ending)  
    
   
plt.plot(pd.Series(assets_ending_list))
plt.xlabel('Month')
plt.ylabel('Ending Asset Value')
plt.show()

The next element we are going to incorporate is one that allows us to set a future date at which we plan to retire and effectively cease working, with the accompanying expectation that our active income (i.e. salary) will fall dramatically, if not disappear altogether. Retirement is an important stage of life to plan for, for many reasons; not only would our inflows from salary be affected, it is likely many of our outflows would change too. By retirement age, it is not uncommon for people to have paid off any outstanding mortgage debt on their homes. This means no more monthly mortgage outflows, and offers other opportunities such as equity release schemes, or “downsizing” of property etc.

Of utmost importance is to ensure that by the time you retire, you have build an asset base large enough to sustain your needs and generate enough investment income to cover your living costs, for how ever many years you end up living in retirement.

With this in mind let us start by adding a planned retirement date (entered as years from today until retirement; I have chosen 25 years in the future as that would place me around the right age). We shall also extend the simulation period from 10 years to 40. I have also added a couple of lines to check whether we are before or after our planned retirement date and adjust our monthly income (i.e. salary inflow) accordingly.

np.random.seed(seed=7)

start, end = datetime.datetime(2000, 12, 31), datetime.datetime(2020, 1,1)

yf.pdr_override()

sp = pdr.get_data_yahoo("^SP500TR", start=start, end=end)


sp_monthly_pct_return = sp.resample('M').last().pct_change().mean().values[0]
sp_monthly_std_dev = sp.resample('M').last().pct_change().std().values[0]


inflows = {'active_annual_income':50_000,
           'starting_assets': 250_000} 


outflows = {'rent':1500,
            'credit_card_payment':750,
            'medical_insurance':250,
            'pension_contribution':500,
            'misc': 1500}


variables = {'start_date' : "01/01/2020",
             'years': 40, 
             'retirement_year': 25,
             'tax_on_active_income_gains': 0.25,
             'avg_ann_income_raise':0.05,
             'avg_ann_inflation': 0.02,
             'tax_on_investment_gains': 0.35,
             'avg_monthly_market_returns': sp_monthly_pct_return, 
             'avg_monthly_market_volatility': sp_monthly_std_dev} 

income_gains_storage = []
investment_gains_storage = [] 


assets_starting_list = [inflows['starting_assets']] 
assets_ending_list = [] 

months = variables['years'] * 12


ruined = False


for month in range(months):
    

    if assets_ending_list:
        assets_starting_list.append(assets_ending_list[-1])
        

    assets = assets_starting_list[-1]

    assets -= (outflows['rent'] + outflows['credit_card_payment'] + \
              outflows['medical_insurance'] + outflows['pension_contribution'] + \
              outflows['misc'])
    

    if assets <= 0:
        inv_gain = 0
        ruined = True
        break

    market_return = np.random.normal(variables['avg_monthly_market_returns'], 
                                     variables['avg_monthly_market_volatility'],
                                    1)[0]
    
    investment_return = (assets * market_return) * (1 - variables['tax_on_investment_gains'])
  
    investment_gains_storage.append(investment_return)
    

    assets += investment_return
    
    
    # add logic to account for zero salary income once retired
    if month >= variables['retirement_year'] * 12:
        income = 0
    else:
        income = (inflows['active_annual_income'] * (1 - variables['tax_on_active_income_gains'])) / 12
    income_gains_storage.append(income)
    
    if (month % 12 == 0):
            inflows['active_annual_income'] *= (1 + (variables['avg_ann_income_raise']))

            outflows['rent'] *= (1 + (variables['avg_ann_inflation']))
            outflows['credit_card_payment'] *= (1 + (variables['avg_ann_inflation']))
            outflows['medical_insurance'] *= (1 + (variables['avg_ann_inflation']))
            outflows['pension_contribution'] *= (1 + (variables['avg_ann_inflation']))
            outflows['misc'] *= (1 + (variables['avg_ann_inflation']))
            
            

    assets += income

    assets_ending = assets

    assets_ending_list.append(assets_ending)  
    
   
plt.plot(pd.Series(assets_ending_list))
plt.xlabel('Month')
plt.ylabel('Ending Asset Value')
plt.show()

Now we can clearly see the effect our drop in salary after retirement has on our ending asset values through time. As is perfectly logical, with no salary coming in the tendency is for our personal wealth and asset base to fall over time. Firstly, we still need to account for the fact that both our post-retirement inflows and outflows are usually fundamentally different from the “normal”, non-retirement situation, and we shall come to that in just a second.

Let’s assume post-retirement we no longer pay “rent” or mortgage payments, so the current rent outflow of 1,500 no longer applies once we hit our retirement date. We could also imagine our health insurance costs and general medical expenses would rise significantly on average, and we might start to receive some pension income each month. The exact situation will vary for each individual, so for now I shall apply the above logic and go from there.

np.random.seed(seed=7)

start, end = datetime.datetime(2000, 12, 31), datetime.datetime(2020, 1,1)

yf.pdr_override()

sp = pdr.get_data_yahoo("^SP500TR", start=start, end=end)


sp_monthly_pct_return = sp.resample('M').last().pct_change().mean().values[0]
sp_monthly_std_dev = sp.resample('M').last().pct_change().std().values[0]


inflows = {'active_annual_income':50_000,
           'starting_assets': 250_000,
           'monthly_pension': 1500} # add estimated post-retirement monthly pension


outflows = {'rent':1500,
            'credit_card_payment':750,
            'medical_insurance':250,
            'pension_contribution':500,
            'misc': 1500,
            'retirement_medical_expenses':850, # add post-retirement monthly medical expenses
            'retirement_misc': 2000} # add post-retirement misc costs


variables = {'start_date' : "01/01/2020",
             'years': 40, 
             'retirement_year': 25, 
             'tax_on_active_income_gains': 0.25,
             'avg_ann_income_raise':0.05,
             'avg_ann_inflation': 0.02,
             'tax_on_investment_gains': 0.35,
             'avg_monthly_market_returns': sp_monthly_pct_return, 
             'avg_monthly_market_volatility': sp_monthly_std_dev} 

income_gains_storage = []
investment_gains_storage = [] 


assets_starting_list = [inflows['starting_assets']] 
assets_ending_list = [] 

months = variables['years'] * 12


ruined = False


for month in range(months):
    

    if assets_ending_list:
        assets_starting_list.append(assets_ending_list[-1])
        

    assets = assets_starting_list[-1]
    
    # add logic to account for different outflows in retirement
    if month >= variables['retirement_year'] * 12:
        outflow = outflows['retirement_medical_expenses'] + outflows['retirement_misc']
    
    else:
        
        outflow = (outflows['rent'] + outflows['credit_card_payment'] + \
              outflows['medical_insurance'] + outflows['pension_contribution'] + \
              outflows['misc'])
    
    assets -= outflow
    

    if assets <= 0:
        inv_gain = 0
        ruined = True
        break

    market_return = np.random.normal(variables['avg_monthly_market_returns'], 
                                     variables['avg_monthly_market_volatility'],
                                    1)[0]
    
    investment_return = (assets * market_return) * (1 - variables['tax_on_investment_gains'])
  
    investment_gains_storage.append(investment_return)
    

    assets += investment_return
    
    
    # add logic to account for pension received in retirement with no salary
    if month >= variables['retirement_year'] * 12:
        income = inflows['monthly_pension']
    else:
        income = (inflows['active_annual_income'] * (1 - variables['tax_on_active_income_gains'])) / 12
    
    income_gains_storage.append(income)
    
    if (month % 12 == 0):
            inflows['active_annual_income'] *= (1 + (variables['avg_ann_income_raise']))

            outflows['rent'] *= (1 + (variables['avg_ann_inflation']))
            outflows['credit_card_payment'] *= (1 + (variables['avg_ann_inflation']))
            outflows['medical_insurance'] *= (1 + (variables['avg_ann_inflation']))
            outflows['pension_contribution'] *= (1 + (variables['avg_ann_inflation']))
            outflows['misc'] *= (1 + (variables['avg_ann_inflation']))
            
            

    assets += income

    assets_ending = assets

    assets_ending_list.append(assets_ending)  
    
   
plt.plot(pd.Series(assets_ending_list))
plt.xlabel('Month')
plt.ylabel('Ending Asset Value')
plt.show()

Ok the above code implements the logic we have discussed thus far. It is, however, getting somewhat messy and could really do with being cleaned up. There are a few parts of the script that could be wrapped up as functions – the two parts that spring to mind are those for calculating our inflows and outflows each period. Let’s start with the outflows:

def calculate_outflows(month, outflows, variables):
    
    # logic to account for different outflows in retirement
    if month >= variables['retirement_year'] * 12:
        outflow = outflows['retirement_medical_expenses'] + outflows['retirement_misc']
    
    else:
        # logic to account for different outflows before retirement
        outflow = (outflows['rent'] + outflows['credit_card_payment'] + \
              outflows['medical_insurance'] + outflows['pension_contribution'] + \
              outflows['misc'])
    
    # each year, increment the values according to the annual inflation input 
    if (month % 12 == 0) and (month > 0):

        outflows['rent'] *= (1 + (variables['avg_ann_inflation']))
        outflows['credit_card_payment'] *= (1 + (variables['avg_ann_inflation']))
        outflows['medical_insurance'] *= (1 + (variables['avg_ann_inflation']))
        outflows['pension_contribution'] *= (1 + (variables['avg_ann_inflation']))
        outflows['misc'] *= (1 + (variables['avg_ann_inflation']))
        
    return outflow

And now our income (not including our investment income – we will create a separate function for that).

def calculate_income(month, inflows, variables):
    
    # logic to account for different inflows in retirement
    if month >= variables['retirement_year'] * 12:
        income = inflows['monthly_pension']
    
    else: 
        # logic to account for different inflows before retirement
        income = (inflows['active_annual_income'] * (1 - variables['tax_on_active_income_gains'])) / 12
        if (month % 12 == 0) and (month > 0):
            inflows['active_annual_income'] *= (1 + (variables['avg_ann_income_raise']))
            
    return income

And now to incorporate it into our main script:

np.random.seed(seed=7)

start, end = datetime.datetime(2000, 12, 31), datetime.datetime(2020, 1,1)

yf.pdr_override()

sp = pdr.get_data_yahoo("^SP500TR", start=start, end=end)


sp_monthly_pct_return = sp.resample('M').last().pct_change().mean().values[0]
sp_monthly_std_dev = sp.resample('M').last().pct_change().std().values[0]


inflows = {'active_annual_income':50_000,
           'starting_assets': 250_000,
           'monthly_pension': 1500} 


outflows = {'rent':1500,
            'credit_card_payment':750,
            'medical_insurance':250,
            'pension_contribution':500,
            'misc': 1500,
            'retirement_medical_expenses':850, 
            'retirement_misc': 2000} 


variables = {'start_date' : "01/01/2020",
             'years': 40, 
             'retirement_year': 25, 
             'tax_on_active_income_gains': 0.25,
             'avg_ann_income_raise':0.05,
             'avg_ann_inflation': 0.02,
             'tax_on_investment_gains': 0.35,
             'avg_monthly_market_returns': sp_monthly_pct_return, 
             'avg_monthly_market_volatility': sp_monthly_std_dev} 

income_gains_storage = []
investment_gains_storage = [] 


assets_starting_list = [inflows['starting_assets']] 
assets_ending_list = [] 

months = variables['years'] * 12


ruined = False


for month in range(months):
    

    if assets_ending_list:
        assets_starting_list.append(assets_ending_list[-1])
        

    assets = assets_starting_list[-1]
    
    # calculate outflows using created function
    outflow = calculate_outflows(month, outflows, variables)
    
    assets -= outflow
    

    if assets <= 0:
        inv_gain = 0
        ruined = True
        break

    market_return = np.random.normal(variables['avg_monthly_market_returns'], 
                                     variables['avg_monthly_market_volatility'],
                                    1)[0]
    
    investment_return = (assets * market_return) * (1 - variables['tax_on_investment_gains'])
  
    investment_gains_storage.append(investment_return)
    

    assets += investment_return
    
    # calculate outflows using created function
    income = calculate_income(month, inflows, variables)

    
    income_gains_storage.append(income)
    
    # remove the code that was previously here regarding the annual incrementing of values 

    assets += income

    assets_ending = assets

    assets_ending_list.append(assets_ending)  
    
   
plt.plot(pd.Series(assets_ending_list))
plt.xlabel('Month')
plt.ylabel('Ending Asset Value')
plt.show()

We can see from the plot above that the results are the same as those we generated before we added the use of functions, which bodes well and suggests the functions have been correctly structured and applied. We might also want to take the code used to calculate the monthly investment income and warp that into a function. This will allow us to more easily add complexity to the logic if necessary.

def calculate_investment_gains(assets, variables):

    if assets <= 0:
        inv_gains = 0
    
    else:
        market_return = np.random.normal(variables['avg_monthly_market_returns'], 
                                         variables['avg_monthly_market_volatility'],
                                        1)[0]

        inv_gains = assets * market_return

    return inv_gains

And now to add the new investment gains function to our script:

np.random.seed(seed=7)

start, end = datetime.datetime(2000, 12, 31), datetime.datetime(2020, 1,1)

yf.pdr_override()

sp = pdr.get_data_yahoo("^SP500TR", start=start, end=end)


sp_monthly_pct_return = sp.resample('M').last().pct_change().mean().values[0]
sp_monthly_std_dev = sp.resample('M').last().pct_change().std().values[0]


inflows = {'active_annual_income':50_000,
           'starting_assets': 250_000,
           'monthly_pension': 1500} 


outflows = {'rent':1500,
            'credit_card_payment':750,
            'medical_insurance':250,
            'pension_contribution':500,
            'misc': 1500,
            'retirement_medical_expenses':850, 
            'retirement_misc': 2000} 


variables = {'start_date' : "01/01/2020",
             'years': 40, 
             'retirement_year': 25, 
             'tax_on_active_income_gains': 0.25,
             'avg_ann_income_raise':0.05,
             'avg_ann_inflation': 0.02,
             'tax_on_investment_gains': 0.35,
             'avg_monthly_market_returns': sp_monthly_pct_return, 
             'avg_monthly_market_volatility': sp_monthly_std_dev} 

income_gains_storage = []
investment_gains_storage = [] 


assets_starting_list = [inflows['starting_assets']] 
assets_ending_list = [] 

months = variables['years'] * 12


ruined = False


for month in range(months):

    if assets_ending_list:
        assets_starting_list.append(assets_ending_list[-1])
        
    assets = assets_starting_list[-1]
    
    outflow = calculate_outflows(month, outflows, variables)
    
    assets -= outflow
    
    # altered the ``if`` code block below to account for use of new function
    if assets <= 0:
        ruined = True
        break
    
    # use our new function to calculate investment returns
    investment_return = calculate_investment_gains(assets, variables)
  
    investment_gains_storage.append(investment_return)

    assets += investment_return
    
    income = calculate_income(month, inflows, variables)

    income_gains_storage.append(income)
    
    assets += income

    assets_ending = assets

    assets_ending_list.append(assets_ending)  
    
   
plt.plot(pd.Series(assets_ending_list))
plt.xlabel('Month')
plt.ylabel('Ending Asset Value')
plt.show()

Just FYI the way we have moved the calculation of monthly returns inside a function and haven’t included the setting of the numpy seed value within that function means that now, each time we run it, we will get slightly different results. I didn’t include the setting of the random seed as if I had, each month’s market return output would have been exactly the same which is of course not what we want in this case.

Now that we have cleaned the code up a bit and started to refactor some of the logic into separate functions I am going to leave it here for this pos. There is still plenty to do before we are finished and we will pick it back up in Part 3.

You may also like

Leave a Reply