Monday, February 20, 2017

Assignment 2

Part 1: Cycle Fever


The TOUR de GEOGRAPHIA is fast approaching, and I've got the need for speed. Team ASTANA (TA) is the safe bet to invest in, consistently churning out race winners year after year; however, Team TOBLER (TT) has been quietly making a name for itself. The following statistics will help determine which team to invest in this year.

Range

The range is the difference between the highest and lowest values in each sample. The range for TA is one hour and ten minutes, while the range for TT is thirty minutes. This suggests that TA has a few outliers that inflate its range, while TT stays together as a pack with a relatively small range.

Mean

The mean is the sum of all values divided by the number of records in the sample. TA has a mean of thirty-seven hours and fifty-seven minutes. TT has a mean of thirty-eight hours and five minutes.

Median

The median is the middle value of the sorted data, with the same number of records above it as below it. The median for TA is thirty-eight hours exactly, while TT's median is thirty-eight hours and nine minutes.

Mode

The mode is the value that is most often repeated in the data. TA's mode is thirty-eight hours, and TT's mode is thirty-eight hours and nine minutes.
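As a rough sketch of how the range, mean, median, and mode can be worked out, the Python below uses the standard `statistics` module. The finish times are made up for illustration (they are not the race data used in this assignment) and are stored as minutes; the helper `fmt` is a hypothetical name of my own.

```python
from statistics import mean, median, mode

# Hypothetical finish times in minutes (NOT the assignment's race data).
times = [2270, 2275, 2280, 2280, 2285, 2290, 2300]


def fmt(minutes):
    """Format a number of minutes as 'H h MM min'."""
    hours, mins = divmod(round(minutes), 60)
    return f"{hours} h {mins:02d} min"


time_range = max(times) - min(times)  # highest value minus lowest value
time_mean = mean(times)               # sum of values / number of records
time_median = median(times)           # middle value of the sorted data
time_mode = mode(times)               # most frequently repeated value

print("Range: ", fmt(time_range))
print("Mean:  ", fmt(time_mean))
print("Median:", fmt(time_median))
print("Mode:  ", fmt(time_mode))
```

Storing the times as whole minutes keeps the arithmetic simple; converting back to hours and minutes only happens when printing.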

Kurtosis

Kurtosis compares the height of a dataset's peak with that of the normal, or Gaussian, curve. When it is reported as excess kurtosis, as many spreadsheet and statistics packages do, the normal curve scores zero. A negative kurtosis means that the peak is flatter than normal and the data is more spread out. A positive kurtosis means that the peak is higher than normal and more observations sit close to the mean. TA's kurtosis is 1.17, while TT's is 2.93.
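To make the idea concrete, here is a minimal sketch of excess kurtosis using the simple moment formula (fourth moment divided by the squared variance, minus 3 so that a normal curve scores zero). This is an assumption on my part about the formula: many packages apply a small-sample correction, so their values will differ somewhat. Both datasets are invented for illustration.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (0 for a normal curve)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3


# Hypothetical samples, not the race data.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]    # evenly spread, flat-topped
peaked = [1, 5, 5, 5, 5, 5, 5, 5, 9]  # most values piled at the center

print("flat:  ", excess_kurtosis(flat))    # negative: flatter than normal
print("peaked:", excess_kurtosis(peaked))  # positive: taller peak
```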

Skewness

Skewness describes how far the curve of a dataset departs from the symmetry of the normal curve. A value of zero means there is no skewness and the curve is symmetric. A positive number means that the curve has a longer tail on the right, with most values bunched on the low (left) side. A negative number means that the curve has a longer tail on the left, with most values bunched on the high (right) side. TA's skewness is -0.0026, and TT's skewness is -1.56.
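A small sketch of how the sign of skewness behaves, using the simple moment formula (third moment divided by the variance raised to the power 1.5). As with kurtosis, this is an assumed formula without the small-sample correction that some packages apply, and the samples are invented.

```python
def skewness(values):
    """Population skewness: m3 / m2**1.5 (0 for a symmetric sample)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical samples, not the race data.
symmetric = [1, 2, 3, 4, 5]
long_high_tail = [1, 1, 2, 2, 3, 10]             # one value far above the rest
long_low_tail = [-x for x in long_high_tail]     # mirror image of the above

print("symmetric:     ", skewness(symmetric))       # zero
print("long high tail:", skewness(long_high_tail))  # positive
print("long low tail: ", skewness(long_low_tail))   # negative
```

Mirroring the data flips the sign but not the size of the result, which is why the sign tells you which side the long tail is on.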

Standard Deviation

Standard deviation describes how tightly observations are clustered around the mean. For normally distributed data, about thirty-four percent of observations fall within one standard deviation on either side of the mean (sixty-eight percent in total). About forty-seven and a half percent fall within two standard deviations on either side (ninety-five percent in total). Three standard deviations cover more than ninety-nine percent of all observations. TA's standard deviation is 17.49, and TT's is 7.78. (Work is below in Figure 1.)
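The percentages tied to one, two, and three standard deviations can be checked with Python's `statistics.NormalDist`. This is only a sketch: it assumes perfectly normal data, and the finish times at the end are hypothetical rather than the values from my hand work.

```python
from statistics import NormalDist, pstdev

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    total = share_within(k)
    print(f"within {k} SD: {total:.1%} in total, {total / 2:.1%} on each side")

# Hypothetical finish times in minutes (NOT the assignment's race data).
times = [2270, 2275, 2280, 2280, 2285, 2290, 2300]
sd = pstdev(times)  # population form; statistics.stdev divides by n - 1 instead
print("standard deviation (minutes):", round(sd, 2))
```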


Investing

Based on these statistics, it appears that the teams will perform close to what was predicted. Team ASTANA looks like a relatively average team overall, with a skewness close to zero and a kurtosis of 1.17. Its wide range suggests that some riders will post very low times, indicating a possible individual winner. Team TOBLER has a skewness of -1.56, which suggests that a majority of its times fall above the mean, and a higher kurtosis, which suggests that its times are clustered around the mean. Together, these make me think that most of the team's times will be bunched slightly above the mean.

All of these statistics make me inclined to invest in Team ASTANA. The team appears to have very high-performing individuals, and its mean time is eight minutes faster than Team TOBLER's, which suggests that the team as a whole performs better. I expect a higher return on both the individual winner and the team funds.


Figure 1: Hand work done on standard deviation.

Part 2: Wisconsin's Center



Figure 2: Map depicting geographic mean center, and weighted population centers of 2000 and 2015.


This map shows the geographic mean center of the state, along with the population-weighted centers for 2000 and 2015. One possible explanation for the movement of the weighted population center is a loss of population in the Madison and Milwaukee areas relative to the rest of the state. As those cities lost population weight, the point would move northwest, away from them, by 2015.
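The difference between a mean center and a weighted mean center can be sketched in a few lines of Python. The coordinates and populations below are invented (four imaginary county centroids on a 10 by 10 grid), and the function names are my own; this is not how the GIS software computed the map.

```python
def mean_center(points):
    """Unweighted mean center: the average x and the average y."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def weighted_mean_center(points, weights):
    """Mean center where each point counts in proportion to its weight."""
    total = sum(weights)
    x = sum(w * px for (px, _), w in zip(points, weights)) / total
    y = sum(w * py for (_, py), w in zip(points, weights)) / total
    return (x, y)


# Hypothetical county centroids (x, y) and populations for two years.
centroids = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
pop_2000 = [100, 100, 400, 100]  # the (10, 10) county dominates
pop_2015 = [200, 100, 300, 100]  # (10, 10) loses weight, (0, 0) gains it

print("mean center:  ", mean_center(centroids))
print("weighted 2000:", weighted_mean_center(centroids, pop_2000))
print("weighted 2015:", weighted_mean_center(centroids, pop_2015))
```

In this made-up case the weighted center slides away from the county that lost population weight, which is the same kind of shift the map shows.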

Wednesday, February 1, 2017

Assignment 1

Part I: Data Types

Nominal Data

Nominal data is based entirely on names. Values such as a state name, an FID, or any other unique identifier serve only as labels, and the data is classified by those labels alone.

Figure 1: Map of all of the Counties Within Wisconsin.


Figure 1 (1) above shows all of the counties in Wisconsin. The data is based on the county names, and each county is separated from the others by its name.

Ordinal Data

Ordinal data is any data that can be ranked in a meaningful order. Grade level (1-12) and hurricane category (1-5) are two of many examples. If the data can be ranked, but the gaps between the ranks are not measured, it is ordinal.

Figure 2: Ordinal data displayed across the World shows the rank of concern due to acts that violate various norms.

Figure 2 (2) shows ordinal data in the form of varying degrees of violations of political, economic, and use-of-force norms across the countries of the world. It is ordered by the number of violations, with darker shades of red as the number of violations gets higher.

Interval Data

Interval data measures the distance between values on a scale with no true zero. The zero point is arbitrary, so the same quantity can be recorded on different scales. Temperature is an example: some countries use Celsius and some use Fahrenheit, and zero means something different on each. What matters in interval data is the size of the difference between measurements.

Figure 3: The map above displays temperature values for different regions across the U.S. in Fahrenheit.

Figure 3 (3) shows the varying temperatures across the conterminous United States. The data is on the Fahrenheit temperature scale, whose degrees and zero point differ from those of Celsius.

Ratio Data

Ratio data is very similar to interval data, except that it has a true zero for measurement. Some examples of ratio data are weight and height. They both have a fixed zero starting point and can only get bigger from there. Another common example is percentages: a share of a whole can't be less than 0% or more than 100%. Figure 4 shows this clearly (4).

Figure 4: The map above shows the influence humans have had on the natural land in the United States. It measures things in terms of how much natural land is left untouched.
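One way to see the difference between interval and ratio data is to check whether a statement like "twice as much" survives a change of units. The sketch below uses the standard Celsius-to-Fahrenheit formula and an approximate kilogram-to-pound factor; the example values are arbitrary.

```python
def c_to_f(celsius):
    """Convert Celsius to Fahrenheit (the two zero points differ)."""
    return celsius * 9 / 5 + 32


def kg_to_lb(kilograms):
    """Convert kilograms to pounds (approximate factor; both share zero)."""
    return kilograms * 2.20462


# Interval data: the ratio changes when the scale changes, so "20 degrees
# is twice as hot as 10 degrees" is not a meaningful statement.
celsius_ratio = 20 / 10
fahrenheit_ratio = c_to_f(20) / c_to_f(10)
print("temperature ratio:", celsius_ratio, "vs", fahrenheit_ratio)

# Ratio data: the ratio is the same in either unit, so "20 kg is twice as
# heavy as 10 kg" holds no matter which unit is used.
kg_ratio = 20 / 10
lb_ratio = kg_to_lb(20) / kg_to_lb(10)
print("weight ratio:     ", kg_ratio, "vs", lb_ratio)
```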

Part II: 

    An important facet of a well-functioning society is gender equality. Gender equality brings more opportunities for every member of society, and thus creates more stimulus for the economy as a whole. In Wisconsin, there are over 7,100 farms where the principal operator is female. This is a good step, but there is much work to be done to make the state more balanced in terms of women's rights, and to educate women so they can become principal operators of a farm themselves if they wish.

    The first map in the series is created using equal interval breaks. This creates evenly spaced groups and gives the viewer a general overview of how the data is spread across its entire range. This map shows that both the central sands region and the northern portion of the state have the lowest numbers of principal female farmers. This makes sense because of the sandy soil and forested areas; however, the central portion of the state holds massive potential for principal female farmers.
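Equal interval breaks are simple enough to sketch by hand: take the range of the data and cut it into classes of the same width. The county counts below are invented, and the function names are mine, so treat this as an illustration of the idea rather than what the mapping software does internally.

```python
from bisect import bisect_left


def equal_interval_breaks(values, k):
    """Upper bounds of k classes of equal width spanning the data range."""
    low, high = min(values), max(values)
    width = (high - low) / k
    return [low + width * i for i in range(1, k + 1)]


def classify(value, breaks):
    """Index of the first class whose upper bound is at or above the value."""
    return bisect_left(breaks, value)


# Hypothetical farm counts for ten counties (NOT the census figures).
counts = [12, 30, 45, 58, 61, 77, 90, 104, 150, 212]
breaks = equal_interval_breaks(counts, 5)

members = [0] * 5
for value in counts:
    members[classify(value, breaks)] += 1

print("class upper bounds:", breaks)
print("counties per class:", members)
```

The classes are equally wide, but nothing forces them to hold the same number of counties, so a few high values can leave the upper classes nearly empty.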

Figure 5: Map of principal female farmers in the state of Wisconsin by county. The red outline displays the developing study area.


    The next map uses the natural breaks method. This finds the five most "natural" groups in the data, placing the breaks where the values already separate, so that counties with similar values are grouped together. This map shows that the central region of Wisconsin offers tremendous opportunity for growth among principal women farmers. The selected counties are surrounded by counties with high numbers of principal female farmers, so once the program is initiated, the communities could rally around it and become stronger in the process.
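The idea behind natural breaks can be sketched as a search for the grouping with the least variation inside each class. The version below is a brute-force stand-in that tries every way of cutting the sorted values, which only works for small datasets; it is not the optimized routine that GIS software uses, and the counts are invented.

```python
from itertools import combinations


def sum_sq_dev(group):
    """Sum of squared deviations of a group from its own mean."""
    mean = sum(group) / len(group)
    return sum((x - mean) ** 2 for x in group)


def natural_breaks(values, k):
    """Split sorted values into k contiguous classes with the smallest
    total within-class squared deviation (exhaustive search)."""
    data = sorted(values)
    best_cost, best_groups = None, None
    for cuts in combinations(range(1, len(data)), k - 1):
        edges = (0, *cuts, len(data))
        groups = [data[a:b] for a, b in zip(edges, edges[1:])]
        cost = sum(sum_sq_dev(g) for g in groups)
        if best_cost is None or cost < best_cost:
            best_cost, best_groups = cost, groups
    return best_groups


# Hypothetical farm counts (NOT the census figures).
counts = [12, 14, 15, 48, 50, 53, 120, 125, 200, 204, 300]
groups = natural_breaks(counts, 5)
for group in groups:
    print(group)
```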

Figure 6: Map of principal female farmers in the state of Wisconsin using the natural breaks method. The red outline displays the developing study area.


    The final map further illustrates the point made by the last map. It was created by classifying the data so that each group has the same number of features. This narrows the search for an area to begin our work to a smaller six-county region in central Wisconsin. It shows that there is an "island" of counties that don't have as many principal female farmers as the counties around them.
    These six counties should be the base of our work, and if successful we can move west and try to increase awareness from the study area to the Mississippi River too.
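Classifying so that each group holds the same number of features can be sketched as sorting the values and cutting the sorted list into equal-sized pieces. The counts below are invented, and this simple version does not handle tied values at a class boundary, which real software has to deal with.

```python
def equal_count_classes(values, k):
    """Split sorted values into k classes holding (as nearly as possible)
    the same number of features."""
    data = sorted(values)
    n = len(data)
    edges = [round(i * n / k) for i in range(k + 1)]
    return [data[a:b] for a, b in zip(edges, edges[1:])]


# Hypothetical farm counts for ten counties (NOT the census figures).
counts = [12, 30, 45, 58, 61, 77, 90, 104, 150, 212]
classes = equal_count_classes(counts, 5)
for members in classes:
    print(members)
```

Every class ends up with two counties here, even though the value ranges the classes cover are very different in width.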

Figure 7: The map above displays the principal female farmers with a quantile classification. The red outline displays the starting study area.






Citations


1) https://www.presentationmall.com/wp-content/uploads/wi-multicolor.jpg
2) http://vmrhudson.org/SOCIC07color.jpg
3) http://www.smu.edu/-/media/Site/Dedman/Academics/Programs/Geothermal-Lab/Graphics/TemperatureMaps/surfacetemp.ashx?la=en

Data: https://www.agcensus.usda.gov/Publications/2012/Full_Report/Volume_1,_Chapter_2_County_Level/Wisconsin/st55_2_047_047.pdf