Descriptive Statistics and Reporting Results

Overview

In today's digitized society, people conduct most of their activities online. Digital devices and the internet facilitate daily activities such as shopping, social interaction, information search, and entertainment. This makes such behavior easy to record and analyze, and it has given rise to computational social science, online marketing, and personalized search engines (Kosinski et al., 2013). Statistical analysis of these records makes it possible to predict information that goes beyond what was explicitly recorded.

Sometimes, people choose not to reveal characteristics such as age and gender. Still, this information can be inferred by analyzing other traces of behavior that they leave, often unknowingly. Therefore, the study examines how basic records of human behavior can help predict personal attributes that some individuals keep private. The study uses Facebook Likes as its record of digital behavior.

Figure 1: Descriptive Statistics and Reporting Results

The figure presents the study design and the attributes and traits selected for predictive analysis. These include religion, sex, personality, origin, and life satisfaction. Other characteristics, such as substance use, political views, and relationship status, are also predicted. The statistical tables present the study outcomes. The participants were a sample of 58,466 volunteers from the US. Their Facebook Likes and information from their profiles were obtained and compiled.

The User-Like Matrix records whether an association exists between each user and each Like, where 1 indicates an association and 0 indicates none. Linear regression predicts numeric variables such as age and intelligence, while logistic regression predicts dichotomous variables such as gender. The User-Like Matrix is simple and explicit. However, the rest of the design may be difficult for a typical reader to follow because of the specialized methods involved. Singular-value decomposition (SVD) reduces the dimensionality of the matrix, and 10-fold cross-validation is used to evaluate the prediction models; neither is a data presentation technique in itself.
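
As an illustration of the 0/1 coding described above, the following is a minimal sketch of how a User-Like Matrix could be built. The user names, Likes, and the function name `build_user_like_matrix` are hypothetical and are not taken from the study.

```python
def build_user_like_matrix(user_likes):
    """Return (users, likes, matrix); matrix[i][j] is 1 if user i has Like j, else 0."""
    users = sorted(user_likes)
    # Collect every distinct Like across all users to form the columns.
    likes = sorted({like for liked in user_likes.values() for like in liked})
    matrix = [
        [1 if like in user_likes[user] else 0 for like in likes]
        for user in users
    ]
    return users, likes, matrix


# Hypothetical toy data, invented for illustration only.
toy_likes = {
    "user_a": {"Hiking", "Jazz"},
    "user_b": {"Jazz"},
    "user_c": {"Chess", "Hiking"},
}

users, likes, matrix = build_user_like_matrix(toy_likes)
print(likes)
for user, row in zip(users, matrix):
    print(user, row)
```

Each row is one user and each column is one Like, so a reader can check any single association by locating the corresponding cell.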

An alternative way of presenting the data frame would be a simple table showing the codes 0 and 1 and the variables they represent. Cross-tabulation results from statistical software such as SPSS could depict the association between a Like and an attribute more vividly (Field, 2018). Since the figure represents the study design, few conclusions can be drawn from it. It does show, however, that the described components rest on a large number of participants from which the information was drawn.
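
The cross-tabulation idea can be sketched in a few lines. This is only an assumed illustration with invented values; the variable names `has_like` and `is_single` and the function `cross_tabulate` are hypothetical, and the output stands in for the kind of table SPSS would produce.

```python
from collections import Counter


def cross_tabulate(row_values, column_values):
    """Count how many cases fall into each (row value, column value) cell."""
    return Counter(zip(row_values, column_values))


# Hypothetical data: 1 = user has the Like / is single, 0 = otherwise.
has_like = [1, 1, 0, 0, 1, 0]
is_single = [1, 1, 0, 1, 0, 0]

table = cross_tabulate(has_like, is_single)

# Print a 2x2 table with the Like indicator as rows and the attribute as columns.
print("            single=0  single=1")
for like_value in (0, 1):
    print(f"has_like={like_value}  {table[(like_value, 0)]:8d}  {table[(like_value, 1)]:8d}")
```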

Figure 2: Dichotomous Variables Prediction

The graph presents the prediction accuracy for dichotomous variables, expressed as the area under the receiver operating characteristic curve (AUC). The AUC measures the probability of correctly classifying two randomly selected users, one from each class, for instance, single vs. in a relationship. The graph includes the data needed to judge how close each prediction comes to perfect classification. The choice of colors to represent the different variables makes the figure appealing and understandable to the reader.
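
The pairwise reading of the AUC given above can be written out directly. The sketch below assumes each user has a numeric prediction score and counts the share of (positive, negative) pairs that the scores rank correctly; the scores and the function name `pairwise_auc` are hypothetical and do not come from the study.

```python
def pairwise_auc(scores_positive, scores_negative):
    """Share of (positive, negative) pairs in which the positive case scores higher.

    Ties count as half a correct ranking.
    """
    if not scores_positive or not scores_negative:
        raise ValueError("both classes need at least one score")
    correct = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                correct += 1.0
            elif pos == neg:
                correct += 0.5
    return correct / (len(scores_positive) * len(scores_negative))


# Hypothetical predicted scores for "single" (positive) and "in a relationship" (negative).
single_scores = [0.9, 0.8, 0.4]
relationship_scores = [0.7, 0.3, 0.2]

auc_value = pairwise_auc(single_scores, relationship_scores)
print(round(auc_value, 3))
```

A value of 0.5 corresponds to guessing, and a value of 1.0 corresponds to perfect classification.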

Given the specific quantity being measured, this method of presentation suits the intended output. Another way to test the model's predictions is the Mean Bias Error (MBE), which indicates whether the estimates are systematically too high or too low. With the error taken as predicted minus actual, a negative MBE means underestimation. The figure shows that Caucasian and African American users were correctly classified in 95% of cases, followed by gender at 93% (Kosinski et al., 2013). This implies that the groups differ enough in the behavior expressed through their Likes to allow an almost perfect classification.
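
The Mean Bias Error mentioned above can be sketched as follows, assuming the error is defined as predicted minus actual so that a negative result means underestimation. The ages and the function name `mean_bias_error` are invented for illustration.

```python
def mean_bias_error(predicted, actual):
    """Average of (predicted - actual); negative values indicate underestimation."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p - a for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical predicted and actual ages, not taken from the study.
predicted_age = [24.0, 31.0, 40.0, 52.0]
actual_age = [26.0, 30.0, 45.0, 54.0]

mbe = mean_bias_error(predicted_age, actual_age)
print(mbe)  # negative here, so these toy predictions underestimate on average
```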

Figure 3: Numeric Variables Prediction

The graph shows the accuracy of predicting numeric variables, expressed as Pearson's product-moment correlation coefficient between the actual and predicted values. The variables include age, the density and size of the Facebook friendship network, and psychological traits. The psychological traits were approximated using questionnaires since they cannot be measured directly (Kosinski et al., 2013). The transparent bars, based on test-retest reliability scores, show the accuracy of the questionnaires used. All the correlations are significant at the p < 0.001 level. The graph presents the data adequately, and the representation is understandable even to a typical reader.
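
For readers unfamiliar with the statistic, the following is a minimal sketch of Pearson's product-moment correlation between actual and predicted values. The numbers and the function name `pearson_r` are hypothetical and only illustrate the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    squares_x = sum((a - mean_x) ** 2 for a in x)
    squares_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(squares_x * squares_y)


# Hypothetical actual and predicted ages, invented for illustration.
actual_values = [20, 30, 40, 50]
predicted_values = [25, 28, 45, 47]

r = pearson_r(actual_values, predicted_values)
print(round(r, 2))
```

Values near 1 mean the predictions rise and fall with the actual values, while values near 0 mean the predictions carry little information.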

We can derive from the figure that openness is predicted almost as accurately as the questionnaire measures it, since the predicted value (r = 0.43) lies close to the test-retest reliability score (r = 0.5). This suggests that observing a user’s Likes is nearly as informative about the user’s openness as the questionnaire itself (Kosinski et al., 2013). Age showed the highest correlation with r = 0.75, followed by network density with r = 0.52 and network size with r = 0.47. The least accurate predictions were for life satisfaction with r = 0.17 and conscientiousness with r = 0.29. For these traits, the prediction accuracy is about half the test-retest reliability score.

Figure 4: Data Amount Available and the Accuracy of Prediction

The graph presents results for participants with between 1 and 700 Likes. The median number of Likes per individual was 68, with an interquartile range of 152. The graph shows how prediction accuracy changes with the number of Likes available for a participant. A random sample of n = 500 users with at least 300 Likes was selected, and predictive models were then run on randomly selected subsets of 1, 2, and so on up to 300 Likes (Kosinski et al., 2013). The line graphs do not label the exact correlation coefficient at each number of Likes.
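
The median and interquartile range reported above are standard descriptive statistics. The sketch below computes them with Python's `statistics` module on a small invented list of Like counts; the data are hypothetical, and the quartiles use the module's default "exclusive" method, which may differ slightly from the method used in the study or in SPSS.

```python
import statistics

# Hypothetical number of Likes per user, invented for illustration.
likes_per_user = [5, 12, 40, 68, 90, 150, 300]

median_likes = statistics.median(likes_per_user)

# quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(likes_per_user, n=4)
iqr = q3 - q1

print("median:", median_likes)
print("IQR:", iqr)
```

Because Like counts are heavily skewed, the median and interquartile range describe a typical user better than the mean and standard deviation would.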

Bar charts showing the number of Likes alongside the corresponding coefficients could present the data better. However, the graph clearly shows the trend between the number of Likes and Pearson’s correlation coefficient for each attribute (gender, age, and openness). We can derive from the graph that about 50% of the users had a minimum of 100 Likes, whereas about 20% had a minimum of 250 Likes each. Dobbin et al. (2008) suggest a model-based approach as an alternative method for determining the amount of data (sample size) required to reach a given prediction accuracy.

References

Dobbin, K., Zhao, Y., & Simon, R. (2008). How large a training set is needed to develop a classifier for microarray data? Clinical Cancer Research, 14(1), 108-114. doi: 10.1158/1078-0432.CCR-07-0443

Field, A. (2018). Discovering statistics using IBM SPSS statistics. Sage Publishers, California.

Kosinski, M., Stillwell, D., & Graepel, T. (2013). Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences, 110(15), 5802-5805. doi: 10.1073/pnas.1218772110

Question

Session 1 Research Assignment

Descriptive Statistics and Reporting Results

Manipulating and describing statistics so that they will be readily understood are basic but required skills for the researcher. Academic researchers may tolerate uninspiring text-based presentations if necessary (but do not need to). However, in the business world, visuals and graphic presentations are a requirement. At the top of the presentation tools arsenal are those beautiful color charts and graphs that help explain the relationships in the numbers.

In their article entitled “Private traits and attributes are predictable from digital records of human behavior,” Kosinski, Stillwell, and Graepel (2013) use the Facebook “Likes” of over 58,000 volunteers in the United States to see if they can make predictions about them (and various sub-groups) based on those Likes. This article blends academic and business research in a way that is engaging and useful for decision-makers in both fields. The mechanics of setting up and manipulating such a large data set (big data) are outside the scope of this course, but this is an important emerging field for supporting decision-making of all types. The terabytes of information now available require ever more sophisticated ways to store, access, categorize, and analyze them.

The authors use Facebook data to test their hypotheses and present their results. Even with only four figures (a flow chart, two bar charts, and a line graph), this article is an excellent example of how descriptive statistics can be used depending on the questions of interest. Note that some graphic descriptions or presentations are better suited to certain types of information (data) than others. As an aside, why might this data set interest a business researcher? The authors explain in their opening paragraph:

A growing proportion of human activities, such as social interactions, entertainment, shopping, and gathering information, are now mediated by digital services and devices. Such digitally mediated behaviors can quickly be recorded and analyzed, fueling the emergence of computational social science and new services such as personalized search engines, recommender systems, and targeted online marketing. (Kosinski et al., 2013)

For this assignment, read the article “Private Traits and Attributes Are Predictable from Digital Records of Human Behavior” (https://www.pnas.org/content/110/15/5802.full). Then answer the following questions for each of Figures 1, 2, 3, and 4:

  1. What type of graphic (descriptive statistical representation) is it, and what data/information does it represent?
  2. Does the graphic represent the data well (is it easy to understand)? Why or why not? (E.g., Think of the type of graphic it is, color choice and placement, data that is included versus excluded; if possible, will it be clear to the target reader, etc.)
  3. What are some other ways that the data might be presented?
  4. What ideas or conclusions can you draw from the graphic?

Your paper should be 3-4 pages long, not including cover and reference pages, and be formatted according to APA guidelines.

Reference

Kosinski, M., Stillwell, D., & Graepel, T. (2013). Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences, 110(15). Retrieved from http://www.pnas.org/content/110/15/5802.full