Fun with Python: an Input Game

I know, I know. It’s not Mathematica, but please bear with me.

Python vs. Mathematica

For starters, it wouldn’t be fair for me to compare them for several reasons:

  1. I have years of experience with Mathematica vs. the few weeks I’ve been working with Python.
  2. The two languages were created for very different purposes.

I’ll try anyway, if only for fun.

It’s interesting to see how different Python is from Mathematica. For example, whitespace characters (i.e. tabs) actually have a function. Unlike in Mathematica, I can’t haphazardly put line breaks wherever I want, nor is it automatic! I also need to keep my code well-organized because it takes so many more lines to write what could be one function in Mathematica, and Canopy (my Python interpreter) lacks the organization and aesthetic beauty of notebooks.

pythoninputs1.png
Compare and contrast.

At the same time, Python seems to be much faster than Mathematica at running code, even though both are higher-level languages; they probably wouldn’t compare with something like C in terms of speed. And although Mathematica is more intuitive for pulling off calculations and visualizing data, Python has much greater flexibility and readability for other kinds of tasks. There are probably many more things Python can do that I’m not aware of. Otherwise, why would it be so popular?

Defining Essential Functions

Many of Python’s more useful functions come in modules. We should all thank the generous individuals who have created these functions for us. We will use the sys module for the function sys.exit().

import sys

This program relies on counting the number of characters in each input and determining whether they are odd or even and, in the case of the user’s name, whether it is prime. We could use len() by itself, but the problem is that it does not subtract spaces. We can define a simple function named character() just in case someone uses their full name.

def character(word):
    return len(word) - word.count(' ')
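Here is a quick check of character() on a few sample inputs (the names are just examples). Note that only the space character is subtracted; the second function is a hypothetical variant, not part of the program, that also ignores tabs and newlines.

```python
def character(word):
    # Count every character except the space character ' '.
    return len(word) - word.count(' ')

def character_all_whitespace(word):
    # Hypothetical variant: split() with no arguments also drops tabs and newlines.
    return len(''.join(word.split()))

print(character('Ellen'))                         # 5
print(character('Ada Lovelace'))                  # 11
print(character('Ada\tLovelace'))                 # 12, the tab is still counted
print(character_all_whitespace('Ada\tLovelace'))  # 11
```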

The next step is to define a function that can determine whether a number is prime. Technically, we could use an existing library of prime numbers, which would be faster. Instead of importing a library, how about we make something that can check if a number is prime on the go? Let’s make a series of tests:

  1. Is the number (we’ll call it “n”) less than 2? If so, it is not prime. This automatically excludes 0 and 1. Negative numbers cannot be prime, either, but that does not matter because we cannot have a string of negative length.
  2. If n is at least 2, is it divisible by any number between 2 and itself? If so, it is not prime: a prime number can only be divided evenly by 1 and itself. We’ll use a for loop to test every number from 2 to n - 1. Remember that range() does not include the second parameter/final number.
  3. If n passes the above tests, return True. Otherwise, return False.

def primeQ(n):
    if n < 2:
        return False
    else:
        for i in range(2,n):
            if n % i == 0:
                return False
        return True

Creating the Story

Now that everything is in place, we can begin making the actual program. Start by taking the user’s name, birth month, and favorite color. In Python 2, use raw_input() to read the input as a string; input() will instead try to evaluate the input as Python code. For example, entering “Ellen” as the name will store a string with raw_input() but will cause an error with input() if there is no variable called “Ellen.” Use the character() function we defined earlier to store the number of letters in each input as a variable.

name = raw_input('Enter your name: ')
nameLetters = character(name)
month = raw_input('Enter your birth month: ')
monthLetters = character(month)
color = raw_input('Enter your favorite color: ')
colorLetters = character(color)
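A note for anyone following along on Python 3: raw_input() no longer exists there, and input() returns a plain string instead. Here is a minimal sketch that should run on either version. The names read_line and ask are made up for this example, and the stand-in reader just lets the helper run without typing anything.

```python
try:
    read_line = raw_input   # Python 2: returns the typed text as a string
except NameError:
    read_line = input       # Python 3: raw_input() was renamed to input()

def character(word):
    return len(word) - word.count(' ')

def ask(prompt, reader=read_line):
    # Hypothetical helper: return the answer together with its letter count.
    answer = reader(prompt)
    return answer, character(answer)

# A stand-in reader that always "types" Ellen, so nothing waits on the keyboard.
name, nameLetters = ask('Enter your name: ', reader=lambda prompt: 'Ellen')
print(name)
print(nameLetters)
```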

Then, determine if the number of characters in “name” is prime, odd, or even. Because we’ll need this information later, we’ll create a variable to store a value that shows what kind of number it is.

if primeQ(nameLetters):
    nameValue = 1
elif nameLetters % 2 == 0:
    nameValue = 2
else:
    nameValue = 3
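The order of these checks matters: 2 is both prime and even, and because the prime test comes first it gets a value of 1. The sketch below wraps the same 1/2/3 coding in a function and swaps in a prime test that stops at the square root of n, which is a common shortcut rather than the loop used in this post. The names is_prime and name_value are made up for the example.

```python
def is_prime(n):
    # Trial division that stops once i * i exceeds n: any factor larger
    # than the square root would pair with a smaller one already tested.
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def name_value(letters):
    # Same coding as above: 1 for prime, 2 for even, 3 for everything else.
    if is_prime(letters):
        return 1
    elif letters % 2 == 0:
        return 2
    else:
        return 3

for letters in (2, 4, 5, 9):
    print('%d letters -> nameValue %d' % (letters, name_value(letters)))
```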

Before we create the story, we’ll have to make sure the user entered the right information.

  1. Use raw_input() to show the user what they entered in a readable way and ask whether it is correct. Because the user could answer “yes” or “no” in many different ways, ask them to enter “Y” or “N.”
  2. If the answer is “Y” for “yes,” continue to the story.
  3. If the answer is “N” for “no,” end the program. We could use quit() or exit(), but it is better to explicitly import the sys module and use sys.exit(), which functions pretty much the same way. Using quit() or exit() relies on the site module, which might not always be there, unlike sys.

answer = raw_input('This is what you entered:\n'
+ 'name: ' + name + '\nmonth: ' + month + '\ncolor: ' + color
+ '\nIs this correct? Enter Y or N.\n')

if answer == 'Y':
    print('Beginning...\n')
else:
    print('Sorry. Please try again.')
    sys.exit()
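One thing to watch: the comparison above is exact, so a lowercase “y” or a stray space counts as “no.” A more forgiving check might look like the sketch below. The is_yes helper is hypothetical and not part of the original program.

```python
def is_yes(answer):
    # Hypothetical helper: ignore surrounding spaces and letter case.
    return answer.strip().upper() == 'Y'

for sample in ('Y', 'y', ' Y ', 'N', 'yes'):
    print('%r -> %s' % (sample, is_yes(sample)))
```

Only a single letter Y (in either case) is accepted, so “yes” still ends the program, which matches the instructions shown to the user.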

The rest of the code is a bunch of if statements that check whether monthLetters and colorLetters are odd or even. For example, the following code checks whether the remainder of colorLetters divided by 2 is zero (that is, whether it’s even). If so, it prints a certain line of text that incorporates the user’s name. A different line of text is produced if it is odd.

if colorLetters % 2 == 0:
    print(name + ' is removed from school for their concerningly erratic behavior. '
    + name + ' burns down the psychiatric ward and escapes to Europe with a fake identity.')
else:
    print(name + "'s teachers say that they are gifted. " + name + ' becomes a young oil tycoon'
    + ' and becomes the youngest billionaire in the world. Climate change worsens.')

Finally, we can test the code. Here’s a run with my information (the bolded words are my inputs).


Enter your name: Ellen

Enter your birth month: February

Enter your favorite color: purple

This is what you entered:
name: Ellen
month: February
color: purple
Is this correct? Enter Y or N.
Y
Beginning…

Ellen is synthetically created on a cloudy February day.
Ellen is actually a robot. Ellen is mistakenly adopted by their current family.
Ellen is removed from school for their concerningly erratic behavior. Ellen burns down the psychiatric ward and escapes to Europe with a fake identity.
Ellen gets married and has 6 children.
Ellen pokes a button and accidentally kills 6.5 billion people. The world is in a crisis.
Ellen falls in a pool and dies at the age of 72.
Ellen will be greatly missed. The end.


As usual, everything is here, on Pastebin. There’s a collapsed preformatted block of text at the bottom. Look! Now that I’m using Python, I can take full advantage of the preformatted function on WordPress!

"""
This project outputs a story based on the character length of their inputs.
Visit my WordPress at ellenleescience.wordpress.com
"""
import sys

def character(word):
    # subtract the number of spaces just in case someone enters their full name
    return len(word) - word.count(' ')

def primeQ(n):
    # we need to find whether the number of letters is prime for later
    # 1 and 0 are not prime, and negatives are impossible
    if n < 2:
        return False
    else:
        # divide n by every number from 2 to n - 1, since a prime can only be divided by 1 and itself
        for i in range(2,n):
            # if the remainder is zero for any number, it is not a prime number
            if n % i == 0:
                return False
        # if n passes every test, it is prime
        return True

name = raw_input('Enter your name: ')
nameLetters = character(name)
month = raw_input('Enter your birth month: ')
monthLetters = character(month)
color = raw_input('Enter your favorite color: ')
colorLetters = character(color)

# now we determine if nameLetters is prime for later
if primeQ(nameLetters):
    nameValue = 1
elif nameLetters % 2 == 0:
    nameValue = 2
else:
    nameValue = 3

answer = raw_input('This is what you entered:\n'
+ 'name: ' + name + '\nmonth: ' + month + '\ncolor: ' + color
+ '\nIs this correct? Enter Y or N.\n')

if answer == 'Y':
    print('Beginning...\n')
else:
    print('Sorry. Please try again.')
    sys.exit()

# the rest of the story is determined by nameValue and whether each Letters value is odd or even
if nameValue == 1:
    print(name + ' is synthetically created on a cloudy ' + month + ' day.')
    print(name + ' is actually a robot. ' + name + ' is mistakenly adopted by their current family.')
elif nameValue == 2:
    print(name + ' is born on a stormy ' + month + ' night.')
    print(name + ' is abandoned by their family. They are raised by stray dogs.')
else:
    print(name + ' is born on a sunny ' + month + ' day.')
    print(name + ' lives with a happy family until their parents die in an accident in the lab.')

if colorLetters % 2 == 0:
    print(name + ' is removed from school for their concerningly erratic behavior. '
    + name + ' burns down the psychiatric ward and escapes to Europe with a fake identity.')
else:
    print(name + "'s teachers say that they are gifted. " + name + ' becomes a young oil tycoon'
    + ' and becomes the youngest billionaire in the world. Climate change worsens.')

print(name + ' gets married and has ' + str(colorLetters) + ' children.')
if monthLetters % 2 == 0:
    print(name + ' pokes a button and accidentally kills 6.5 billion people. The world is in a crisis.')
else:
    print(name + ' finds a cure to rabies, but their research is stolen. They never receive credit.')

if nameValue != 1:
    print(name + ' lives to the ripe age of ' + str(monthLetters*9) + '.')
    print("Cause of death:")
    if colorLetters % 2 == 0:
        print('tripped on a banana peel')
    else:
        print('hit by a meteor')
else:
    print(name + ' falls in a pool and dies at the age of ' + str(monthLetters*9) + '.')

print(name + ' will be greatly missed. The end.')

P.S.: How does the Python logo work?

Drawing Maps in Mathematica

I thought I’d take a shot at drawing some maps because maps are cool, right?

We used to use CountryData[] to get the coordinates (which needed to be flipped so the image wouldn’t end up sideways) of the borders of each country to draw them. Because we are not barbarians, we will use GeoGraphics[] to draw each country.

Basic Principles

Entering the command by itself will give you an image of your location. For example:

drawingmaps1

…an image of Southern California! And you can change the styles of the maps by using the GeoBackground option. Use “StreetMapNoLabels” instead of “StreetMap” if you do not want bloated letters floating on your map.

drawingmaps8.png

Fancier Techniques

GeoGraphics[] has many more sophisticated capabilities. Before I jump in, here’s a link to all of my code. I used the following variables.

drawingmaps2

We can plot several countries at once. Here’s a map with Canada, the United States, and Mexico. Color any country by using FaceForm[<replace bolded with color>]; colors do not need quotation marks. I recommend using EdgeForm[Black] because the colors tend to be light due to their opacity. You can tamper with the opacity and make them solid masses on the map, but then you will not be able to see the background as easily. For some reason, Alaska and Hawaii are excluded in the United States’ polygon data for GeoGraphics[].

drawingmaps3

We can make the map even fancier by putting the flags of each country on their designated locations. Instead of individually importing each flag, use area[“Flag”] to retrieve the image and superimpose it on the map with GeoStyling[].

If you want more demonstrations of flag maps, check out this official page from Wolfram.

drawingmaps4

It looks great, but it’s just a smidgen away from perfection. Shall we replace the flags with more appropriate images?

drawingmaps5

Muahahaha! There is no escape from current events, even on this obscure corner of the internet. This is a joke, by the way. Please do not take this seriously.

Celestial Bodies

To drown out my worries about the future of my country, let’s shift gears to something more exciting: astronomy. You can’t go wrong with outer space!

Unfortunately, Mathematica has limited functionality when it comes to drawing maps of outer space. I mean, relative to what it can do with other things. It is still possible to make awesome space maps. How about this comparison of the Moon’s Webb Crater and the Manicouagan Reservoir in Quebec? (P.S. GraphicsRow is great if you have multiple images and want to set them to the same size.)

drawingmaps6

I know, the second image looks like some kind of lunar pimple more than a crater. It’s a common optical illusion, making it easy to think that the Moon has many mountains. The Manicouagan Reservoir is what’s left of a crater formed by a 5 km asteroid. The significantly more unstable weathering conditions on Earth have clearly produced an impact on the two craters.

The Moon is the most detailed celestial body you can create maps of in Mathematica, excluding the Earth. For example, try getting a picture of Jupiter’s moon Europa:

drawingmaps7

Not looking too hot. Zooming in only yields a pixelated mess. That’s because we have very few close-up images of Europa and we have not explored it too much (yet). I hope we will one day be able to send satellites there to see what lies in Europa’s vast oceans. Who knows what we’ll find there, living or not?

drawingmaps9

College Application Survey Results

Continuation from the last post.

Thanks to /r/applyingtocollege, I was able to amass a great amount of data.

Results

See the full results for yourself here.

A whopping 40.4% of the responses came from California! According to the US Census Bureau, only about 12% of the responses should have come from here. This may have had to do with the fact that this survey was posted around 9:30PM; because of differences in time zones, mostly the US west coast (read: California) would have had a chance to respond.

collegesurvey1.png

Another thing that caught me off guard was that there were people applying to 5+ schools early – 29, in fact. Perhaps the people on the much higher end were looking to apply to all of their schools early and get their results early?

collegesurvey2.png

And the confidence scale was also quite interesting. It’s odd that the distribution, although skewed, looks roughly normal, with small spikes for 1 and 10. I wonder what made people pick 7 instead of 6 or 8? I was the first person to test the survey and I picked 7 as well. Maybe it’s because 7 isn’t as close to the middle as 6 is, but is still sufficiently far away from 10 to demonstrate some level of modesty or cautiousness. Another thing: people who left comments were more likely to score lower on the confidence scale (but most of the comments were “i’m dying lol” or “VERY STRESSFUL” so that makes sense). More confident people were less likely to leave comments.

collegesurvey3.png

Possible Sources of Error

Well, the kind of students who go out of their way to find and interact with a community focused on college applications probably have enough initiative to go to more competitive schools. As a result, many would feel the need to apply to many colleges because the acceptance rate is so incredibly low, especially for early applicants. I wonder how the results would have changed if I posted the survey after early decisions were released; the second graph would have most likely been even more skewed to the left than it already is.

In hindsight, I should have allowed short answers for the numbers and given instructions to type a number. Then, I would have been able to import the data from a spreadsheet and make the graph myself.

Final Comments

College applications can be the most stressful time of the year, especially if one does not manage their time well by not starting the UC application until four days before the deadline. At least we all know that there are hundreds of thousands of other individuals who are going through the same experience – each handling it differently – and that we can all suffer together.

On a side note, pie charts are terrible for displaying population distribution (something I will fix at a later date 🙂 ).

Only marginally related: The most stressful time for me was watching this…

collegesurvey4.png

…The number of visits increased significantly a week or so before my early decisions were released, and I put this blog on my college applications as a hobby. At least now I know which weeks I was scrutinized by admissions officers.

collegesurvey5.png

Sampling Error in Mathematica

Happy Halloween!

In the spirit of Halloween, it seems that my Mathematica stopped functioning for the past week (too spooky for me :O). Manipulate[], arguably the most useful function in the program, refused to work. It got to the point where something as simple as this

Manipulate[ToString[a], {a, 0, 1}]

would not work at all!

Thankfully, we’re back up and running. Off to sampling errors!

Sampling Sound Waves

Although we often like to think that sampling will always yield accurate results, this often is not the case. Consider the images below:

sampling1

The red points represent how often a point from the function is taken. The first example shows that the predicted red function matches the actual wave. However, the next one shows how it seems as if the amplitude of the function is changing. And the last one shows that by taking samples 2Pi increments away from each other, you get a straight line. It should be noted that the sampling itself is a flaw caused by human error; this error is very much different from the fact that there is “infinite precision” for vector graphs.

Such is the danger of interpolating data without sufficient sampling. An interactive version of the above and the code for the images themselves can be found here.
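For anyone without Mathematica, the flat-line case is easy to reproduce in a few lines of Python. This is only a sketch of the idea, not the notebook code; sample_sine and the step sizes are chosen here for illustration.

```python
import math

def sample_sine(step, count):
    # Take `count` samples of sin(x) at x = 0, step, 2*step, ...
    return [math.sin(k * step) for k in range(count)]

dense = sample_sine(0.5, 8)           # several samples per period
sparse = sample_sine(2 * math.pi, 8)  # exactly one sample per period

print([round(v, 3) for v in dense])
print([round(v, 3) for v in sparse])  # every value rounds to zero (some may show as -0.0)
```

The dense samples trace out the wave, while the samples taken 2Pi apart all land on the same point of each cycle, so connecting them gives a straight line.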

Sampling errors are present in our daily lives as well! The most intuitive example is with sound. Hypothetically, a higher frequency means a higher pitched sound.

(Note: I actually wasn’t aware that you could play sounds in Mathematica until a little less than a year after I started using it. That just goes to show that there are so many unexplored features of this program… I wonder how many people specialize in fields that I’m not even aware of? I feel as if I am pulling so many layers off this onion that the pile of onion shavings is hundreds of times larger than what would seem possible for its size. /rant)

sampling2

And for a while, the sounds actually do get higher pitched. If you look at the wave with a frequency of 20,000, you can see that Mathematica is struggling.

sampling3

But if you suddenly increase the frequency to 256,000, the pitch dramatically decreases. So what gives?

sampling4

This is actually a problem concerning hardware. By trying to play this sound, you’re literally forcing your headphones (or speakers) to vibrate so quickly that there is no way for them to keep up. In trying to oscillate that quickly, they downgrade the frequency of the pitch to a harmonic of 256,000; if you want to see a really good example of this, set the interactive code’s sampling value at 5.5.
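Separately from what the speakers can physically do, the sampling itself also limits pitch: a pure tone above half the sample rate is indistinguishable from a lower one once it has been sampled. The sketch below uses the standard folding formula for that effect. The 44,100 samples per second is an assumption (a common audio rate), not a value taken from the notebook, and apparent_frequency is a made-up name.

```python
def apparent_frequency(frequency, sample_rate):
    # Fold the true frequency back into the range 0 .. sample_rate / 2
    # by measuring its distance to the nearest multiple of the sample rate.
    nearest_multiple = round(frequency / float(sample_rate)) * sample_rate
    return abs(frequency - nearest_multiple)

SAMPLE_RATE = 44100  # assumption: a common audio rate, not taken from the post

for f in (440, 20000, 256000):
    print('%d Hz sounds like %d Hz' % (f, apparent_frequency(f, SAMPLE_RATE)))
```

Under that assumed rate, the first two tones come out unchanged, while 256,000 folds down to 8,600, which would be heard as a much lower pitch.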

sampling5.png

Everything suddenly seems lower pitched. If there’s a lesson here, I guess it’s that you should always be careful how you sample… You may even end up with a flat line.

sampling6.png
Probably not how this works, but still.

P.S.: Many of the ideas for this post came from my research mentor Dr. James Choi. I am very grateful for his help!

Moving Statistics (Light Post)

Hello! It’s been a hectic month; we’ve been preparing to move to our new home (not too far from here, it’s just better than our current house). In the meantime, I decided to research data regarding the number of new houses sold and their ft^2 of area.

New Single-Family Houses and ft^2 in the United States

The house we’re moving to has more living space than our current one. I was wondering if the demand for new, larger houses would increase over time. This would be shown by the number of large houses increasing. I got the data* from here; I found that website through data.gov, which is amazing if you need any kind of data, by the way. The code can be found here.

Here’s the number of new houses constructed each year. The second graph shows subgroups for different ranges of living space.

movingstatistics1.png

Unsurprisingly, the number of new houses constructed each year has decreased. As more homes are built, the available land decreases and becomes more expensive. The rate of population growth is also decreasing in the US. (Take a look at population pyramids for post-industrialized nations!) Medium-size houses seem to consistently be the most popular. The demand for larger houses has increased very slightly, as shown by how the curves for 3000+ square feet are higher relative to the other curves.

Western US

Since I live in sunny California, I wanted to see how the western United States is different from the country as a whole.

movingstatistics2

It seems that there was a higher demand for houses in the west before the first dip. I’m willing to bet that the bulk of this was for California’s nice weather, or maybe for the west coast in general. The housing bubble probably caused the sharp rise, then decline, between 2004 and 2008.

My findings were interesting. I knew that there was a huge decline in house sales after the bubble burst, but I never would have guessed that it was this sharp. It seems that although the number of new homes is starting to rise slightly, it still isn’t near what it was at its peak.

movingstatistics3.png

*NOTE: The data must be transposed because it is in columns, not rows. Here’s how I did it:

areaData = Import[fileName][[1]];
areaDataUS = Transpose[areaData[[10 ;; 26]]][[2 ;; -1]];
areaDataW = Transpose[areaData[[93 ;; 109]]][[2 ;; -1]];
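Since this blog has started dabbling in Python, here is roughly what the same transpose-then-drop-the-first-row step looks like there. It is only a sketch: the rows below are made up to stand in for the spreadsheet, with a label first and then one value per column.

```python
# Made-up rows standing in for the spreadsheet: a label, then three values.
rows = [
    ['row A', 10, 12, 9],
    ['row B', 20, 22, 18],
    ['row C', 30, 33, 27],
]

# zip(*rows) turns rows into columns, like Transpose[].
columns = [list(column) for column in zip(*rows)]

# Dropping the first transposed row removes the labels, like [[2 ;; -1]].
values = columns[1:]
print(values)  # [[10, 20, 30], [12, 22, 33], [9, 18, 27]]
```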

Analyzing the Word Frequency of Donald Trump’s and Hillary Clinton’s Speeches

We all have our stereotypes regarding the two presidential candidates of arguably the worst election in a long time: Donald Trump and Hillary Clinton. When I and many others picture Donald Trump, we think of his promise to build a wall dividing Mexico and the US (even though it kind of already exists in the form of chain-link fences and intense security); Hillary Clinton seems to conjure thoughts of manipulative e-mails and corporate influence. How do these two candidates compare in terms of their word choice for their speeches?

The Process: Trump’s Speech at the Republican National Convention

The following steps were taken to create a graph and a word cloud for Trump’s RNC Speech. See the notebook here for the code (locked to prevent editing)!

  1. Get the transcript (source for this speech)
  2. Remove punctuation and unneeded words (articles, auxiliary verbs, etc.), then split the words into different lists
  3. Make all of the words lowercase
  4. Tally and sort the data
  5. Generate visuals
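The notebook itself is in Mathematica, but steps 2 through 4 can be sketched in Python for comparison. Everything here is an assumption made for illustration: the stop-word list is a tiny stand-in for the real one, the sample sentence is invented rather than quoted from a speech, and lowercasing happens before the punctuation is stripped instead of after.

```python
import re
from collections import Counter

# Assumption: a tiny stop-word list for illustration only.
STOP_WORDS = {'the', 'a', 'an', 'and', 'of', 'to', 'is', 'are', 'will', 'be'}

def word_frequencies(transcript):
    # Lowercase first, then keep only runs of letters and apostrophes,
    # which drops punctuation and digits in one pass.
    words = re.findall(r"[a-z']+", transcript.lower())
    kept = [w for w in words if w not in STOP_WORDS]
    # most_common() returns (word, count) pairs sorted by descending count.
    return Counter(kept).most_common()

# Invented sample text, not a quotation from any speech.
sample = "Our team and our town: we build, we plan, and we build again."
for word, count in word_frequencies(sample):
    print('%s: %d' % (word, count))
```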

speech1.png

Results

The above steps were repeated for several speeches for a total of two per candidate.

Hillary’s DNC Speech (source):

speech3

Trump’s Youngstown, Ohio Immigration Speech (source):

speech2

Hillary’s South Carolina Speech (source):

speech4.png

Both candidates use the words “we” and “our” frequently to reinforce a sense of unity within their respective parties. Oddly enough, Clinton seems to use the pronoun “you” much more often than Trump. They also address each other semi-frequently, most likely because they refute each other’s arguments.

Clinton uses the words “together” and “communities” more often than Trump, who underscores “immigration” and “terrorism.” This may have been the result of the speeches I chose. Trump also mentions ISIS frequently and focuses on the dangers immigrants may bring. This is consistent with his views of immigration. On the other hand, Clinton focuses more on bringing people together or how her campaign was supported. She probably wants to gain empathy from her audience, focusing on how well they have done as a whole.

…I don’t have much experience with text manipulation and analysis, though. My process may have several mistakes.

It isn’t very surprising that some Americans are looking for a hard-nosed leader like Donald Trump while others are looking for a candidate with somewhat safer ideas like Hillary Clinton (personally, I don’t like either of them very much but to each their own). 2016 is proving to be a disastrous year, though not quite the worst as some claim.

speech5