Analytica in the Classroom
Teaching Quantitative Modeling Skills
Teaching your students to think analytically and clearly about messy real-life decision problems is a key challenge for a professor in science, engineering and business. Appropriate use of software modeling tools can greatly enhance the learning experience, as long as you as an instructor keep the emphasis on developing these skills rather than on the mechanics of using the software. The use of Excel in the classroom, in particular, is often problematic in this respect. We often see the emphasis, and the focus of the students, shift from developing clear thinking and modeling skills to mastering spreadsheet mechanics. We believe that using Analytica in the classroom helps keep the focus on the core modeling skills.
For those of us who have been working in quantitative fields for many years, it is easy to lose sight of some of the most difficult, yet most important, skills that students must learn. These include:
- Ability to clearly define variables.
- Real-world problems are messy, quantitative, subjective, fuzzy, etc. Yet a formal analytical model must reduce these to unambiguous and quantitative terms -- what we usually refer to as variables.
- Identifying how the parameters and variables of a problem influence each other.
- Influence isn't always as straightforward as it seems. It often involves notions of causation, but in a decision model, even more fundamental is how the information flow should proceed, which may not always be strictly causal. Identifying influences may involve abstract reasoning about how things can be computed. It also involves reasoning about which variables could reasonably be obtained or assessed directly, versus computed from (decomposed into) other variables.
- Finding information and assessments
- Finding facts and assessments in publications, on the web, by contacting subject experts, etc., is a key skill of any model builder, and it greatly affects the first two steps by determining how models can be decomposed.
- Assessing unknowns
- Models always contain estimates, guesses, and back-of-the-envelope assessments. Analysts never find every fact they need already published, and must fill in unknowns, which leads to the next point.
- Expressing uncertainty
- The importance of explicit representations of uncertainty is now widely recognized across nearly all quantitative fields of study. Expressing assessments as distributions, rather than single numbers, can actually speed up model building and lead to much deeper insight.
- Working with others
- Model builders must synthesize knowledge from other domain experts.
- Combining quantitative findings from models with unmodeled factors
- If you aren't careful, your students may think quantitative analysis will spit out the "correct answer". Since models only capture some of the world, we always have to take insights gained from analyses and combine them with unmodeled considerations, as well as contradictory results. A good curriculum exposes students to exercises along these lines.
- Presenting findings
- Analytical results are about insights gained from models, not about specific numbers. Developing clear and transparent analyses (and clear and transparent models) is absolutely a key skill.
Ideas For Assignments
Immigration Policy and Population Growth
Have the students build a model of population growth in your country, including the impact of immigration policy. Without any immigration, what would the population of your country be in 2050? How about at current immigration levels? What if you let twice the number of people in? Remember, immigrants have babies once they are here as well.
Many factors figure into these models -- the age of immigrants (are they of child-bearing age?), illegal (which is harder to control) vs. legal (which can be legislated), enforcement of illegal immigration policies, etc. Students should be expected to consider a variety of scenarios and show the impacts.
Students will need to identify key variables -- such as the number of immigrants who currently migrate to this country each year. They will be expected to look up facts, to assess unknown quantities, to resolve conflicting estimates, and ideally, to include probability distributions on assessments.
Emphasize that the analyst's role here is to project population growth, not to make judgements on social issues. The analysis should serve to help understand the problem in a larger context, providing vital insight for policy makers as to what specific immigration levels actually mean for future national infrastructure. Again, this is one piece in a bigger picture.
To launch the assignment, spend part of a lecture discussing what some of the key variables are. It is a topic that many students can participate in. There will be more ideas floated than any one person can reasonably incorporate in their own model.
You'll find a large variation in the final results from models produced by different students. Having some oral presentations of the models to the class allows students to see the variation in results, and encourages them to constructively critique the assumptions made by their peers. Encourage students to show their Analytica models directly on the projector during their oral presentation.
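As a starting point for the lecture discussion, the three scenarios posed above (no immigration, current levels, twice the current levels) can be compared with a very small projection. The sketch below is written in Python rather than Analytica, and it is only an illustration: it uses a single aggregate natural growth rate instead of age cohorts, and every number in it is a made-up placeholder, not a real statistic. The function name `project_population` and all parameter names are hypothetical.

```python
def project_population(start_pop, natural_growth_rate, annual_immigrants, years):
    """Project total population year by year.

    Each year the resident population (which includes earlier immigrants,
    so they also "have babies once they are here") grows by
    natural_growth_rate, and then annual_immigrants new people arrive.
    """
    pop = start_pop
    trajectory = [pop]
    for _ in range(years):
        pop = pop * (1 + natural_growth_rate) + annual_immigrants
        trajectory.append(pop)
    return trajectory


# Hypothetical placeholder inputs, NOT real statistics.
START_POP = 100e6              # people today
NATURAL_GROWTH = 0.002         # net births minus deaths, per year
CURRENT_IMMIGRATION = 500_000  # people per year
YEARS = 30                     # projection horizon

scenarios = {
    "no immigration": 0,
    "current level": CURRENT_IMMIGRATION,
    "double": 2 * CURRENT_IMMIGRATION,
}
results = {
    name: project_population(START_POP, NATURAL_GROWTH, level, YEARS)[-1]
    for name, level in scenarios.items()
}
for name, final_pop in results.items():
    print(f"{name:>15}: {final_pop / 1e6:.1f} million")
```

A student model would replace the single growth rate with age-specific fertility and mortality, split legal from illegal immigration, and put probability distributions on the uncertain inputs.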
Taxation of Marijuana
California leads the nation in agricultural production. Yet, it is often claimed that the largest cash crop produced in the state is marijuana. This crop is sold predominantly on the black market, where no taxes are collected.
In 2015, several proposals are being floated to legalize and regulate marijuana sales so that the state can benefit from the resulting tax revenues. Widely varying claims are being circulated in the media for the amount of tax revenues that will be generated from the legalization of pot. In addition to sales taxes, various "vice taxes" could be imposed on pot sales, as is already done with alcohol and tobacco products.
For this assignment, students are to build a model predicting the amount of tax revenue that would be collected by the state of California if marijuana were to be legalized and taxed. Numerous factors come into play, and the problem provides an excellent opportunity to identify what those factors are, how they interrelate, and how to quantify them. It is also a nice exercise for uncertainty assessment, since most of the relevant factors are highly uncertain, due partly to the underground nature of the business historically.
The first level of the assignment should focus on the direct financial impacts. The assignment can also be extended to include the modeling of social costs (and rewards) incurred from the legalization. On the one hand, there could be savings from decreases in jail time served for pot-related offenses, yet there may be increases in usage and hence increased costs in medical or social services. Up to this point, the focus can still remain on purely monetary impacts, separate from moral issues. Models should allow an exploration of policy options -- such as how different tax levels impact net tax revenue. They should also make uncertainty in the forecasts explicit.
Finally, a third angle on this assignment (usually after the above monetary analyses are complete) is to incorporate non-monetary moral factors. Students will have widely differing views on what moral considerations exist and how significant they are. This extension to the assignment opens up topics related to combining very non-quantitative objectives with quantitative ones.
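For the first, purely monetary level of the assignment, the two requirements stated above (explore different tax levels, and make uncertainty explicit) can be illustrated with a small Monte Carlo sketch. It is written in Python for illustration only. Every distribution and number below is a hypothetical placeholder rather than data, and the assumed linear loss of legal market share as the excise tax rises is just one possible structure that students might choose or reject.

```python
import random
import statistics

SALES_TAX_RATE = 0.08  # hypothetical placeholder


def simulate_revenue(excise_rate, n_samples=10_000, seed=0):
    """Return Monte Carlo samples of annual tax revenue in dollars."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Hypothetical: total annual retail sales (low, high, mode), dollars.
        market = rng.triangular(2e9, 10e9, 5e9)
        # Hypothetical: share of sales that moves to the legal market
        # when there is no excise tax.
        base_share = rng.uniform(0.4, 0.9)
        # Hypothetical: fraction of legal share lost per unit of excise rate.
        sensitivity = rng.uniform(0.5, 1.5)
        legal_share = max(0.0, base_share * (1 - sensitivity * excise_rate))
        samples.append(market * legal_share * (SALES_TAX_RATE + excise_rate))
    return samples


for excise_rate in (0.0, 0.1, 0.2, 0.4):
    samples = simulate_revenue(excise_rate)
    cuts = statistics.quantiles(samples, n=20)  # cut points at 5%, 10%, ..., 95%
    p5, p50, p95 = cuts[0], cuts[9], cuts[18]
    print(f"excise {excise_rate:.0%}: 5%={p5/1e6:,.0f}M  "
          f"median={p50/1e6:,.0f}M  95%={p95/1e6:,.0f}M")
```

Reporting a range (here the 5th to 95th percentiles) for each tax level, rather than a single number, keeps the uncertainty visible when students compare policy options.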
Will Pension Plan bankrupt San Jose?
In this problem, students are provided with several facts, sufficient to fill in most of the model details without requiring independent research. Yet they must still grapple with arranging the facts into a quantitatively productive model structure.
The police and fire fighters of the City of San Jose, California, have an exceptionally generous pension plan. When they retire, they receive an annual pension of 90% of their salary at the time of their retirement, plus a 3% per year increase. Police and fire fighters are eligible to retire at age 55 with 20 years of service, at age 50 with 25 years of service, or at any age with 30 years of service. Military veterans receive 4 years of service credit (i.e., can retire 4 years earlier).
Some taxpayers think this is another egregious misuse of government funds. Your assignment is to analyze the actual cost of this pension plan now, and into the future as existing police and fire fighters retire (and as retirees die off).
Here are some facts from the San Jose Mercury News, 12 July 2009:
- $111,260 -- Average base pay for San Jose police officers and firefighters.
- $98,541 -- Average pension for the 90 public safety workers who retired last year.
- $73,818 -- Average pension for San Jose's 1,422 retired police officers and firefighters.
Students can also use these numbers:
- Number of fire fighters: 750
- Number of police officers: 1300
Students may make the following assumptions for the purpose of the analysis:
- Inflation of 3% annually
- That average salary levels increase at the rate of inflation, 3% annually.
- Staffing levels remain constant into the future. Those who retire are replaced by recruits.
- Officers retire immediately when they are eligible.
- 40% of officers are veterans, same percentage applies to recruits.
- Age of veterans joining the force is Uniform(24,34)
- Age of non-veterans joining the force is Uniform(20,34)
- Current age of officers is roughly uniform between the recruitment and retirement ages. Students will have to find a reasonable representation of this that fits with the notion that they retire when eligible.
- Attrition is negligible, doesn't need to be included in model. All officers will work until retirement.
- Life expectancy after retirement is Normal(72,8)
- Spousal survivor benefits can be ignored (or you can view these as built into the life expectancy assessment).
The student's model should project the total (inflation-adjusted) retiree payments (the cost) of the pension plan over the next 40 years. The inflation-adjusted cost places the future cost in 2009 dollars. As more officers reach retirement age, it is pretty clear that the pension cost increases, but of interest is by how much. Is this sustainable, or will it eventually force the city of San Jose into bankruptcy?
This provides an example of how a quantitative model might have been very useful if it had been built prior to the union contract negotiations.
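One building block that every version of this model needs is the retirement age implied by the eligibility rules. The Python sketch below is one possible reading of those rules, not a definitive one: it treats the veterans' 4 years as extra service credit (the age thresholds of 50 and 55 still apply), and the function names are hypothetical. It then samples hire ages from the stated Uniform distributions with the stated 40% veteran share.

```python
import random

VETERAN_SERVICE_CREDIT = 4  # years of service credit, from the problem statement


def earliest_retirement_age(hire_age, is_veteran):
    """Earliest age at which any one of the three eligibility rules is met.

    Assumption: the veteran credit adds to years of service only.
    """
    credit = VETERAN_SERVICE_CREDIT if is_veteran else 0
    return min(
        max(55, hire_age + 20 - credit),  # age 55 with 20 years of service
        max(50, hire_age + 25 - credit),  # age 50 with 25 years of service
        hire_age + 30 - credit,           # any age with 30 years of service
    )


def sample_hire_age(is_veteran, rng):
    """Hire age: Uniform(24,34) for veterans, Uniform(20,34) otherwise."""
    return rng.uniform(24, 34) if is_veteran else rng.uniform(20, 34)


rng = random.Random(0)
retirement_ages = []
for _ in range(10_000):
    is_veteran = rng.random() < 0.40  # 40% of officers are veterans
    hire_age = sample_hire_age(is_veteran, rng)
    retirement_ages.append(earliest_retirement_age(hire_age, is_veteran))

mean_age = sum(retirement_ages) / len(retirement_ages)
print(f"mean retirement age: {mean_age:.1f}")
```

Under this reading, every officer hired within the stated age ranges retires between 50 and 55. Note also that because pensions rise 3% per year and inflation is assumed to be 3%, each retiree's pension is constant in inflation-adjusted dollars, which simplifies the projection considerably.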
Cash for Clunkers
The 2009 Cash for Clunkers program in the United States received tremendous publicity and was called an overwhelming success for the environment. The assignment here is to build a model that estimates the net impact on greenhouse gas emissions.
The Cash for Clunkers program was a U.S. federal government program, subsidized by $3,000,000,000, which provided cash rebates of $3,500 or $4,500 to people who purchased or leased a new car, when those consumers turned in a used, less fuel-efficient car to be scrapped. The used car had to be fully functional and less than 25 years old. The used cars were unconditionally scrapped -- removed from circulation, reduced to scrap metal.
When the new car was rated between 4 and 9 mpg higher than the trade-in, a credit of $3,500 was paid. When the new car was rated 10 mpg or more above the trade-in, $4,500 was paid.
Several factors figure into the actual environmental impact. For this assignment, for simplicity, students are to examine only the impact on overall carbon dioxide emissions. The improved gas mileage of the new car leads to decreased CO2 emissions, of course. But substantial energy is also expended to scrap the old car and build the new one, which may include many direct and indirect inputs. The model builder may need to conduct some research on the various energy inputs required to construct and scrap (without any reuse of parts) automobiles, and compare these negative environmental impacts to the positive savings from improved gas mileage.
The student can start with one specific example (the model can be initially constructed so that the exact parameters are inputs that can be changed). Suppose Joe drives a 15 mpg pickup truck, which he would have tolerated for another 4 years and 60K miles had he not scrapped it for cash. He upgrades to an equivalent 20 mpg new truck. His vehicles typically last for 8 years, with 15K miles per year. Amortize emissions for any new vehicle construction or scrapping over an 8-year period. Compute the cumulative CO2 emitted over a 10-year period in two scenarios: one where he upgrades now, and one where he continues to use his clunker for 4 more years before upgrading.
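The rebate rule as described here is simple enough to state as a small function. This Python sketch follows only the simplified rule in the text (it does not attempt to reproduce the full program regulations), and it assumes that an improvement falling between 9 and 10 mpg is treated like the lower tier. The name `clunker_rebate` is hypothetical.

```python
def clunker_rebate(trade_in_mpg, new_mpg):
    """Rebate in dollars under the simplified rule described in the text."""
    improvement = new_mpg - trade_in_mpg
    if improvement >= 10:
        return 4500
    if improvement >= 4:
        # Assumption: anything from 4 up to (but not including) 10 mpg
        # falls in the lower tier.
        return 3500
    return 0  # improvement too small to qualify


print(clunker_rebate(15, 20))  # a 5 mpg improvement
print(clunker_rebate(15, 27))  # a 12 mpg improvement
print(clunker_rebate(15, 17))  # a 2 mpg improvement
```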
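Joe's case can be laid out as a short calculation before building the full model. The Python sketch below is one way to structure it, under several assumptions that students should question: only tailpipe CO2 from fuel is counted for driving; roughly 8.9 kg of CO2 per gallon of gasoline is used as an approximate conversion factor; the combined emissions of building a new truck and scrapping the old one is a made-up placeholder (`EMBODIED_KG_CO2`) that students must research; the clunker's own construction emissions are treated as sunk; and any replacement truck is identical to the first new one.

```python
KG_CO2_PER_GALLON = 8.9    # assumption: approximate tailpipe CO2 per gallon
EMBODIED_KG_CO2 = 10_000   # hypothetical placeholder: build new + scrap old
MILES_PER_YEAR = 15_000    # from the example
OLD_MPG = 15               # from the example
NEW_MPG = 20               # from the example
VEHICLE_LIFE_YEARS = 8     # amortization period, from the example


def cumulative_co2(years_on_clunker, horizon_years=10):
    """Total kg of CO2 over the horizon if Joe keeps the clunker for
    years_on_clunker more years and then drives a new truck."""
    total = 0.0
    for year in range(horizon_years):
        if year < years_on_clunker:
            # Old truck: fuel only (its construction emissions are sunk).
            total += MILES_PER_YEAR / OLD_MPG * KG_CO2_PER_GALLON
        else:
            # New truck: fuel plus one year of amortized embodied emissions.
            total += MILES_PER_YEAR / NEW_MPG * KG_CO2_PER_GALLON
            total += EMBODIED_KG_CO2 / VEHICLE_LIFE_YEARS
    return total


upgrade_now = cumulative_co2(years_on_clunker=0)
wait_four_years = cumulative_co2(years_on_clunker=4)
net_saving = wait_four_years - upgrade_now

print(f"upgrade now:     {upgrade_now:,.0f} kg CO2")
print(f"wait four years: {wait_four_years:,.0f} kg CO2")
print(f"net saving from upgrading now: {net_saving:,.0f} kg CO2")
```

With this structure the two scenarios differ only in the first 4 years, so the net saving is the fuel saved in those years minus 4 years of amortized embodied emissions. Whether the program helps in Joe's case therefore depends almost entirely on the embodied-emissions input, which is exactly the quantity students need to research.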
Could stray planetary bodies in open space be detected?
Are there planet-sized dark objects (rocks, ice cubes) populating the vastness of empty space between visible stars? Is it possible that such ordinary matter actually constitutes a substantial fraction of our galaxy's mass? Or of the universe's mass? Would there be a way to detect these objects?
The Kepler satellite is monitoring the light intensity of thousands of distant stars, looking for periodic decreases in light intensity as evidence of an orbiting exoplanet. If a stray interstellar dark body were to move between Earth and one of these stars, it would cause a one-time dimming event. If it were close enough to Earth, it could totally eclipse the star, causing a drop of light intensity to zero.
Ignoring the complication that it might be hard to separate such one-time events from other sources of random noise, there is a question of how probable such a crossing would be. It would require the dark body to move precisely across the line from us to the star, and a total eclipse would require it to be much closer to Earth than to the distant star. We'll suppose that any crossing that occludes some light is a theoretically detectable event.
Create a model to predict the probability of a detectable event (both a dimming event and a total eclipse event) when one star is being monitored, given various input assumptions about the density of dark planetary bodies in open space, the size distribution of these bodies (relative to the radius of the star being monitored), the distribution of speeds of these bodies, and the distance to the star being monitored.
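One possible geometric starting point is sketched below in Python. It is a deliberately simplified illustration, not a reference solution. It assumes a uniform number density of bodies, a single body radius and a single transverse speed (students would replace these with distributions), straight-line motion, Earth treated as a point observer, and crossings that arrive as a Poisson process. Under those assumptions, at distance x from Earth the star's disk projects to a radius of R*x/D, so a body dims the star if its center passes within r + R*x/D of the sight line, and totally eclipses it if its center passes within r - R*x/D. All function names and example values are hypothetical.

```python
import math


def event_rates(number_density, body_radius, star_radius,
                star_distance, transverse_speed):
    """Return (dimming_rate, total_eclipse_rate) in events per unit time.

    Units must be consistent (for example km, km^-3 and km/s).
    """
    n, r, R, D, v = (number_density, body_radius, star_radius,
                     star_distance, transverse_speed)
    # Dimming: integrate the strip width 2*(r + R*x/D) over x in [0, D].
    dimming_rate = n * v * D * (2 * r + R)
    # Total eclipse: strip width 2*(r - R*x/D), which is positive only
    # out to x_max, where the body's angular size matches the star's.
    x_max = min(D, r * D / R)
    total_rate = n * v * 2 * (r * x_max - R * x_max ** 2 / (2 * D))
    return dimming_rate, total_rate


def probability_of_event(rate, duration):
    """Probability of at least one event in `duration` (Poisson arrivals)."""
    return -math.expm1(-rate * duration)  # 1 - exp(-rate * duration)


# Hypothetical example inputs, in km and seconds.
KM_PER_LIGHT_YEAR = 9.4607e12
dimming, total = event_rates(
    number_density=1 / KM_PER_LIGHT_YEAR ** 3,  # one body per cubic light year
    body_radius=6_371,                          # roughly Earth-sized
    star_radius=696_000,                        # roughly Sun-sized
    star_distance=1_000 * KM_PER_LIGHT_YEAR,
    transverse_speed=30,                        # km/s
)
four_years = 4 * 365.25 * 24 * 3600
print(f"P(dimming event): {probability_of_event(dimming, four_years):.3g}")
print(f"P(total eclipse): {probability_of_event(total, four_years):.3g}")
```

Because the rates are linear in density, speed and distance, students can quickly see which input assumptions the answer is most sensitive to, and then scale the single-star probability to the number of stars being monitored.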