Chapter 1. Introduction

Music plays a powerful role in societies, in cultures, and in the personal lives of many people. It represents and shapes societies and cultural identities, reflecting their unique traditions and values. In daily life, music accompanies listeners as they go about their tasks: while traveling, shopping, studying, and even working. People listen to music to relax after a stressful day, or dance to music as a form of exercise. People also search through their tracks to find the one song that will make them feel good or help them release their negative emotions. When people come together, as in funerals, weddings, birthdays, and other celebrations, music is omnipresent. In all of these instances, it is clear that music plays a significant role in this world.

Over the years, people have listened to music in different formats, and the emergence of each new medium has brought the decline of older ones. Audio formats have evolved greatly since they were first introduced to the market. About a century ago, music was available on larger media such as the vinyl record. This medium, one of the oldest audio formats still in use today, has been played since the 1880s (Vinyland.com, 2018). It was a long time before a new music medium was invented and introduced to the music industry. In the early 1960s, audio cassettes were introduced to consumers, and their sales slowly increased. After a few years, cassettes outsold vinyl records and continued to sell more units until another audio format, the compact disc, was introduced in the 1980s.

Since the invention of the large vinyl record, the media on which music is stored have become smaller, more accessible, and more portable, until music became available in purely digital form. From vinyl records to audio cassettes, compact discs, and MP3s, music continues to play a significant part in the lives of many people, regardless of the format in which it is stored. Today, very few people listen to music stored in physical formats like audio cassettes, vinyl records, and compact discs. Instead, people purchase music from various websites, apps, and subscription services such as iTunes and Spotify. Many people also illegally download songs from the internet for free.

As of 2017, RIAA data shows that downloaded albums and singles make up 83% of all music sales volume. Downloaded singles account for 74.5% of all music sold, although RIAA data shows that this declined by 25.5% from the previous year. Moreover, compact discs accounted for 11.8% of all music formats sold in 2017, which is a 10.3% decrease from 2016. Finally, although vinyl records have existed for more than a century, the medium continues to live on, even if it is no longer purchased as widely as before. As of 2017, vinyl records contribute 2.1% of all music sold in the United States. Surprisingly, this is a 5.3% increase from the 2016 sales volume.

Objectives of the Study

Over the years, the world has witnessed how one audio format slowly rises to fame, reaches its peak, and then gradually declines, to be outsold by a new and better format. This is true for many media, such as vinyl records, audio cassettes, compact discs, and others. Many factors contribute to the rise and decline of these physical and virtual formats. Oftentimes, the arrival of a new medium causes the decline of the others. This paper aims to understand the rise and decline of compact discs. More specifically, this paper aims to answer the following research questions:

Are the sales volumes of vinyl records, audio cassettes, and compact discs associated with each other?

Can the sales volume of compact discs be explained by the sales volume of vinyl records and audio cassettes?

In order to answer these two research questions, the following objectives have been developed:

Quantify the association of the sales volumes of vinyl records, audio cassettes, and compact discs using pairwise correlation coefficients; and

Develop a statistical model of the sales volume of compact discs as a function of the sales volumes of audio cassettes and vinyl records.

Scope and Limitations

This study focused solely on compact disc sales volume and how it is affected by the sales volumes of audio cassettes and vinyl records. It did not investigate the revenues of these formats, which may or may not lead to different correlation and multiple regression results. Moreover, it did not investigate compact disc sales volumes as a function of the sales volumes of newer audio formats such as downloaded music, DVDs, music videos, and others.

In addition, the data used in this study was taken from the website of the Recording Industry Association of America (RIAA) and may differ from other data sources. Because of the limited data available from the RIAA, this study performed descriptive analysis on a total of 45 data points. Correlation analysis and least squares estimation, on the other hand, were conducted on a more limited data set of 26 data points. Because of this limited availability, the study could only provide insights into the trends of sales volumes beginning in 1973. Hence, it provides a complete picture of the number of compact discs sold since their introduction to the market, but not of vinyl records and audio cassettes.

Chapter 2. Methodology

This study aimed to identify an association between the sales of compact discs, audio cassettes, and vinyl records. This study is also aimed at developing a statistical model that will explain the factors that drive the sales of compact discs. Specifically, this study determined if compact disc sales may be explained by the variations in the sales of both audio cassettes and vinyl records.

To achieve these objectives, data was obtained from the website of the Recording Industry Association of America (https://www.riaa.com/u-s-sales-database/). The data includes the sales volume of vinyl records from 1973 until 2006, of audio cassettes from 1973 until 2004, and of compact discs from 1983 until 2017. In total, the data set consists of 45 observations. The data was analyzed using the R programming language.

Descriptive Statistics

Prior to performing a correlation analysis and multiple linear regression, descriptive statistics on each of the three variables are first presented via tables and graphs. Graphs are used to provide a visual representation of the trend of the sales volume of each of the three audio formats over time. These figures make it easy to see the rise and decline of CDs, cassettes, and vinyl record sales. In addition to the graphs, measures of central tendencies such as the mean and median are summarized in this paper. Measures of variation such as the minimum, maximum, range, standard deviation, as well as the variance are also presented in the Results and Discussion section. The measures of central tendencies and variation are summarized in this paper in a tabular format. These descriptive statistics, together with the graphs, are all essential parts of this paper because these provide more insight and understanding of the results of the correlation and regression analyses.
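The measures named above can be computed as in the following minimal sketch. It is written in Python for illustration only (the study itself used R). The `describe` helper and the sample values are hypothetical, and the sketch assumes the sample (n - 1) forms of the standard deviation and variance, which the text does not specify.

```python
import statistics


def describe(sales):
    """Summarize one audio format's yearly sales volumes (hypothetical helper)."""
    return {
        "mean": statistics.mean(sales),
        "median": statistics.median(sales),
        "min": min(sales),
        "max": max(sales),
        "range": max(sales) - min(sales),
        # Assumption: sample (n - 1) standard deviation and variance.
        "sd": statistics.stdev(sales),
        "variance": statistics.variance(sales),
    }


# Made-up yearly sales in million units, NOT taken from the RIAA data.
example_sales = [2.0, 4.0, 4.0, 6.0, 9.0]
summary = describe(example_sales)
```

Because the range is the maximum minus the minimum and the variance is the square of the standard deviation, such a summary also gives a quick consistency check on any reported table of descriptive statistics.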

Pairwise Correlation Analysis

In order to determine whether the three variables are associated with each other, pairwise correlation analysis was performed. More specifically, correlations were performed to see whether there is an association between the sales of vinyl records and audio cassettes; vinyl records and compact discs; as well as audio cassettes and compact discs.

Correlation analysis is performed in order to quantify, in terms of a correlation coefficient, the association between two continuous variables. There are various formulas to find the correlation coefficient, but this study used the formula to find the Pearson Product Moment correlation coefficient, which is given as:

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

The formula for the Pearson r coefficient above will provide insight into the strength and direction of the relationship between each of the two pairwise comparisons of the different audio formats. The Pearson r may range between -1 and +1. A negative coefficient may mean that there is an opposite relationship between two variables (McClave et al, 2001). For instance, if one variable increases, the other may decrease and vice versa. A positive correlation, on the other hand, may mean that a low level of one variable is associated with a lower level of the other and vice versa.

Moreover, the magnitude of the correlation coefficient will indicate the strength of the association between the two variables. A coefficient that is closer to 1 may indicate a strong relation, while a coefficient that is lower may indicate either a moderate or weak strength of association, depending on the magnitude. Finally, a correlation coefficient that is close to zero may indicate that there is no linear relationship between the two variables (McClave et al, 2001).
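As an illustration, the computational form of the Pearson product-moment coefficient can be sketched as follows. This is a minimal Python sketch rather than the R code used in the study, and the function name `pearson_r` and the sample values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation from raw sums (computational formula)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return numerator / denominator


# Made-up values: a perfectly increasing and a perfectly decreasing pair.
r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
```

The two made-up pairs sit at the extremes of the range described above: the first is a perfect positive linear relationship and the second a perfect negative one.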

Multiple Linear Regression

Based on a given response Y and independent variables x1, x2, …, xn, the relationship between the independent variables and the response can be modeled in this form:

$$Y = f(x_1, x_2, \ldots, x_n) + \varepsilon$$

The response Y is expressed as a function of n independent variables x1, x2, …, xn. The random error ε represents other sources of variability that include measurement error, effects of other variables, and others (McClave et al, 2001).

In this study, multiple linear regression was performed in order to explain the variation in the sales of compact discs based on audio cassette and vinyl records sales. The dependent variable, sales of compact discs, is expressed as a function of two independent variables, namely, the sales of audio cassettes and sales of vinyl records. Based on these variables, a regression equation of the following form was developed using the least squares method:

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

Assumptions of Linear Regression

Prior to performing the least squares method in developing the model, this study first conducted a series of diagnostic tests in order to determine if the data set does not violate the assumptions of a multiple linear regression. The first assumption that was tested in this study was the assumption of linearity. Bivariate scatterplots are presented in this paper to provide insights into the linearity of the data.

This study also tested for the normality of the residuals, which are the differences between the actual and predicted observations. To do this, histograms of the residuals and normal probability plots were produced. Homoscedasticity, or homogeneity of the variance, was also tested by plotting the residuals against the predicted values. It is important to check for homoscedasticity in order to ensure the appropriateness of the multiple regression model as well as the constancy of the variance of the error term (McClave et al., 2001).

Finally, this study checked for multicollinearity between the two independent variables. For the model to be reliable, the sales of audio cassettes and vinyl records must not be strongly correlated with each other. When correlated variables are incorporated in one model, it is difficult to determine which of the independent variables contributes to the variance explained in the dependent variable. Multicollinearity is assessed in this study by inspecting the Variance Inflation Factor (VIF), which compares the variance of each estimated regression coefficient with the variance it would have if the independent variables were not linearly related. The ideal VIF value is 1, while VIFs above 10 indicate that multicollinearity may be excessively influencing the least squares estimates (Kutner, Nachtsheim, and Neter, 2004).
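The VIF of a predictor is commonly defined as 1 / (1 - R_j²), where R_j² is the coefficient of determination obtained by regressing that predictor on the remaining predictors. With exactly two predictors, R_j² reduces to the squared correlation between them. The Python sketch below illustrates this two-predictor case only; it is not the R code used in the study, and the function names and sample values are hypothetical.

```python
import math


def _pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = _pearson_r(x1, x2)
    if abs(r) >= 1.0:
        raise ValueError("predictors are perfectly collinear; VIF is unbounded")
    return 1.0 / (1.0 - r * r)


# Made-up predictor values, NOT taken from the RIAA data.
vif_correlated = vif_two_predictors([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
vif_uncorrelated = vif_two_predictors([1, 2, 3, 4], [1, -1, -1, 1])
```

Uncorrelated predictors give the ideal value of 1, and the value grows without bound as the correlation between the predictors approaches 1 in magnitude.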

Model Evaluation

To determine how well the model fits the data, a residuals plot was investigated in this study. The coefficient of determination, R², was also examined to measure the model’s goodness of fit. R² measures how much of the variation in the response is explained by the model. The closer R² is to 1, the better the model fits the data; lower values of R² indicate that the model explains less of the variability of the response (McClave et al., 2001).
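The coefficient of determination can be written as one minus the ratio of the residual sum of squares to the total sum of squares. The following minimal Python sketch illustrates this definition; the function name `r_squared` and the sample values are hypothetical and are not taken from the study.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equally long")
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    if ss_total == 0:
        raise ValueError("R^2 is undefined when the response is constant")
    return 1.0 - ss_residual / ss_total


# Made-up actual and predicted values.
example_r2 = r_squared([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.5])
```

Predictions that match the actual values exactly give a value of 1, while predicting the mean of the response for every observation gives 0.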

Finally, Analysis of Variance was performed in this study in order to check if the model provides an adequate approximation to the actual system. ANOVA was conducted to test the null hypothesis that all terms in the model are unimportant for predicting the dependent variable against the alternative hypothesis that at least one model term is useful for predicting the sales of compact discs.
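For a model with k predictors fitted to n observations, the overall ANOVA F statistic can be expressed in terms of R² as F = (R² / k) / ((1 - R²) / (n - k - 1)). The Python sketch below illustrates this relationship. It plugs in the R² of 0.76 reported in the results and assumes that the 26 data points mentioned under Scope and Limitations are the regression sample. The function name is hypothetical, and the resulting value is an illustration rather than the study's own ANOVA output.

```python
def overall_f_statistic(r_squared, n_obs, n_predictors):
    """Overall F statistic of a linear regression, computed from R^2."""
    residual_df = n_obs - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("not enough observations for this many predictors")
    explained_per_df = r_squared / n_predictors
    unexplained_per_df = (1.0 - r_squared) / residual_df
    return explained_per_df / unexplained_per_df


# Assumption: R^2 = 0.76 with n = 26 observations and k = 2 predictors.
f_value = overall_f_statistic(0.76, 26, 2)
```

The statistic would then be compared with an F distribution with k and n - k - 1 degrees of freedom to decide whether to reject the null hypothesis.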

Chapter 3. Results and Discussion

Descriptive Statistics

Figure 1 shows the trend in the sales volume of vinyl records, audio cassettes, and compact discs from 1973 until 2017. The figure shows that by the early 1970s, both vinyl records and audio cassettes were already being sold. While vinyl record sales exhibit a dwindling pattern beginning in 1973, audio cassette sales volume reflects a slow increase from the 1970s until the 1980s and a decline beginning in the 1990s. Within the data set, vinyl records reached their maximum sales of 228.0M units in 1973, although, because the data begins in that year, sales could have been higher in earlier years. Sales decreased by 10.5% the following year and by a further 19.6% in 1975. In 1976, the number of vinyl records sold increased by 15.9%; the volume continued to increase until 1979 but then dropped by 25.9% in 1980. The largest drop in vinyl record sales volume occurred in 1989, when sales decreased by 44.2%, from 65.6M to 36.6M units.
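The year-over-year percentages quoted in this section follow the usual percent-change formula, (new - old) / old × 100. A minimal Python sketch, using the 1989 vinyl figures stated above and a hypothetical function name, is:

```python
def percent_change(previous, current):
    """Percent change from the previous value to the current value."""
    if previous == 0:
        raise ValueError("percent change is undefined for a zero starting value")
    return (current - previous) / previous * 100.0


# Vinyl record sales in million units before and after the 1989 drop.
vinyl_drop_1989 = percent_change(65.6, 36.6)
```

Rounded to one decimal place, this gives the 44.2% decrease reported for 1989.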

Figure 1. Sales volume trend of vinyl records, audio cassettes, and compact discs in million units from 1973 until 2017.

Audio cassette sales, on the other hand, slowly increased beginning in 1973. Increasing by less than 6% in 1974 and 1975, the number of units sold jumped by 34.6% and 69.3% in the next two years. Audio cassette sales volume continued to rise until it reached its peak sales in 1988 at 450.1M, which decreased by 0.9% the following year. Figure 1 shows that the decrease in audio cassette sales was not abrupt, just like vinyl record sales. Data shows that audio cassette sales slowly decreased by an average of 9.4% over the next ten years. However, beginning in 1999 until 2008, drastic declines in audio cassette sales are observed. During this period, average decrease was at 48.8%, until no more cassette tapes were recorded sold by the RIAA in 2009.

One striking difference between audio cassette and vinyl record sales volumes, as observed in the graph, is that vinyl records continue to be purchased today, although by relatively few customers, while audio cassettes remained in the music industry for a much shorter time. Even today, about a century since vinyl records were first introduced to the market, people continue to purchase music in this format. In contrast, audio cassette sales were quick to die down. According to Consumer News and Business Channel, while vinyl sales form a small portion of physical album sales, it continues to outstrip digital downloads, mainly from the younger generation. According to CNBC.com, almost half of vinyl purchasers are between 18 and 24 years old. Moreover, one out of four people in this 18-24 demographic said that they had purchased a vinyl record in the past twelve months. Research also shows that more than half of vinyl purchasers only buy used records, while about a third only buy new vinyl records (CNBC.com, 2018).

Moreover, compact discs only began to record sales in 1983. According to Sony Music Global, after a series of research efforts, demonstrations, and tests in the 1970s, large-scale manufacturing of compact discs began in 1982. It was also during this year that Sony first released an album on CD, which was “52nd Street” by Billy Joel. In 1983, compact discs and players began to hit the US market as well as other parts of the world (Sony Global, 2018). This explains why the earliest recorded sales for compact discs are in 1983. The sales of CDs follow a pattern quite similar to that of audio cassette sales. The difference is that while audio cassette sales volume increased rather slowly after its introduction, CD sales volume increased abruptly. Right after CDs were introduced to the market, sales grew more than sevenfold, from 0.8M to 5.8M units. Over the next three years, the average annual sales increase was 172%: 289.1% in 1985, 134.5% in 1986, and 92.6% in 1987. CD sales continued to increase over the following years until they reached their maximum of 942.5M units in 2000. From 2001 until 2017, compact disc sales decreased by an average of 12.8% per year. This rate of decrease is considerably lower than that of audio cassette sales.

It is quite interesting to note how these audio formats outsold each other over the years. Figure 1 shows that from 1973 until 1980, vinyl records outsold audio cassettes. During this period, the slope of the line for vinyl record sales volume is almost horizontal, which means that the decline was rather small. Data shows that during this period, the average decrease in the number of vinyl records sold was only around 3.8%. In contrast, the slope of the audio cassette sales volume, although it started at almost zero, became steep beginning in 1976. From then on, more and more cassette tapes were sold, and the average increase was 41.6%. While audio cassettes were initially outsold by vinyl records in the 1970s, their rate of increase was great enough that they outsold vinyl records in the 1980s. From then until the early 2000s, audio cassettes outsold vinyl records, which were not able to recover as their sales volume continued to decrease. This is true even after audio cassettes themselves began to decrease in sales volume.

With the introduction of compact discs to the market, almost the same pattern can be observed. Upon their introduction in 1983, CDs were outsold by both cassettes and vinyl records. However, because of the decrease in vinyl record sales, CDs outsold vinyl records within just five years. It took longer for CDs to overtake cassette tapes: nine years. Beginning in 1983, cassette tapes were experiencing only a slow increase in sales volume, averaging 6.5%, while CDs were exhibiting an average increase of 160.2%. Because CD sales volume grew so quickly during this time frame, CDs outsold cassettes from 1992 through 2017.

There are various differences in the overall sales of the three different audio formats as shown in Table 1. This table shows that the average sales of vinyl records from 1973 to 2017 is only at 58.8M. But then again, due to limitations in the data set, the average sales could be higher if previous years were included in the data set. What this average says is that during this period when new audio formats are introduced, old formats like vinyl records and audio cassettes were no longer much patronized by the consumers.

Table 1 also shows that the average number of audio cassettes sold over the 36 years that they were on the market is 172.1M units. The average number of compact discs sold is much higher, with average annual sales of 420.1M units over 35 years. Moreover, the maximum number of compact discs sold is 942.5M, which is more than twice the maximum sales volume of audio cassettes. Because of their portability, CDs brought much more convenience to consumers, which is one of the reasons why they sold considerably more units than vinyl records and cassette tapes. With the invention of the portable disc player, and with their smaller size compared to vinyl records, CDs allowed consumers to bring their music with them and listen to it wherever they went.

Table 1. Descriptive statistics of the sales volume of vinyl records, audio cassettes, and compact discs from 1973 until 2017.

Statistic            Vinyl Records   Audio Cassettes   Compact Discs
Mean                 58.8            172.1             420.1
Median               10.2            123.8             333.3
Min                  0.3             0.1               0.8
Max                  228.0           450.1             942.5
Range                227.7           450.0             941.7
Standard Deviation   75.5            155.8             306.7
Variance             5,699.9         24,264.9          94,081.8

Values are in million units, except the variance, which is in squared million units.

Pairwise Correlation Analysis

Results of the pairwise correlation analysis show that there is a moderate association between the number of compact discs and vinyl records sold, as well as a strong association between compact discs and audio cassettes. However, there is a weak association between audio cassettes and vinyl records. Table 2 summarizes the Pearson Product Moment Correlation Coefficient, which quantifies the association between the variables in this study. As shown in the table, the correlation coefficient between audio cassettes and vinyl records is -0.2. This indicates a weak correlation between these two variables, which means that an increase or decrease in the sales volume of audio cassettes is not necessarily associated with a corresponding decrease or increase in the sales volume of vinyl records. As shown in the scatter plot in Figure 2, while there is no linear relationship between these two variables, there seems to be a nonlinear, quadratic relationship between them.

There is, however, a negative strong linear relationship between compact discs and audio cassettes, which is at -0.7. This simply means that while the sales volume of compact disc increases, there is a strong likelihood that the sales volume of the audio cassettes will decrease. Conversely, if fewer compact discs are sold, this corresponds to a higher likelihood that more audio cassettes units are sold.

Table 2. Pearson Product Moment Correlation Coefficient of the sales volume of vinyl records, audio cassettes, and compact discs from 1973 until 2017.

                  Vinyl Records   Audio Cassettes   Compact Discs
Vinyl Records     1.0             -0.2              -0.5
Audio Cassettes   -0.2            1.0               -0.7
Compact Discs     -0.5            -0.7              1.0

Figure 2. Scatter plots of the sales volume of vinyl records, audio cassettes, and compact discs from 1973 until 2017.

Finally, there is a negative moderate linear relationship between vinyl records and compact discs. The Pearson Product Moment correlation coefficient for these two variables is at -0.5. A negative value means an inverse relationship, while the magnitude of 0.5 implies a moderate relationship. Hence, the correlation coefficient of -0.5 means that an increase in the number of vinyl records that are sold brings with it a moderate likelihood that the sales volume of compact discs will decrease. Conversely, a decrease in the sales volume of compact discs implies a moderate likelihood that the sales volume of vinyl records will increase.

Multiple Linear Regression

Assumptions of Linear Regression

Prior to building the model, it is important to verify that the data satisfies the assumptions of a multiple linear regression. As shown on Figure 2, there is a linear relationship between each of the independent variables to the response, hence satisfying the linearity assumption. Moreover, Figure 3 below shows that the plotted internally studentized residuals follow a nearly straight line, which indicates that these follow a normal distribution. Hence, the data set satisfies the assumption on the normality of residuals.

Figure 3. Normal probability plot

Figure 4. Residuals versus predicted values plot

Table 3. Variance Inflation Factor of the independent variables.

Independent Variable   Variance Inflation Factor
Vinyl Records          4.5
Audio Cassettes        1.3

The internally studentized residuals were also used in assessing the constancy of the error variance. Figure 4 is a plot of the internally studentized residuals versus the predicted response values in ascending order. This figure shows a random scatter of the residuals across the graph, thus indicating that the assumption on the constancy of the error variance is satisfied. Finally, it was also important to check for the multicollinearity between the two independent variables, namely, sales volume of the audio cassettes and vinyl records. The ideal VIF value is 1, while VIFs above 10 indicate that multicollinearity may be excessively inflating the least squares estimates. Table 3 shows that the VIF for audio cassette is 1.3, while VIF for vinyl records is 4.5. Both VIF values indicate that there is no collinearity between the two independent variables that would heavily influence the results of the regression and lead to inaccurate interpretations of the regression.

Model Building

All four important assumptions of the multiple linear regression have been satisfied. Hence, it is appropriate to perform regression to build the model, which is given as follows:

$$\text{CompactDisc} = 778.08 - 5.20 \times \text{VinylRecord} - 0.52 \times \text{AudioCassette}$$

In the above model, the variables CompactDisc, VinylRecord, and AudioCassette refer to the sales volumes of compact discs, vinyl records, and audio cassettes, respectively, in million units. Based on the model, when no vinyl records or audio cassettes are sold, the expected sales volume of compact discs is about 778 million units. This figure decreases by 5.20 units for every unit of vinyl records sold, assuming that audio cassette volume remains constant. Likewise, it decreases by 0.52 units for every unit of audio cassettes sold, assuming that vinyl record sales are kept constant.
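To make the least squares step and the interpretation of the coefficients concrete, the sketch below fits a two-predictor model by solving the normal equations and then evaluates the reported equation. It is a minimal Python illustration, not the R code used in the study. The function names and input values are hypothetical, the inputs are made up rather than taken from the RIAA data, and all volumes are assumed to be in million units.

```python
def fit_two_predictor_ols(x1, x2, y):
    """Least squares estimates (b0, b1, b2) for y = b0 + b1*x1 + b2*x2."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centered sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("predictors are perfectly collinear")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


def predict_cd_sales(vinyl, cassette):
    """Evaluate the fitted model reported in the text (million units assumed)."""
    return 778.08 - 5.20 * vinyl - 0.52 * cassette


# Made-up predictor values; the response is generated from the reported
# equation so that the fit should recover the same coefficients.
vinyl = [10.0, 20.0, 30.0, 40.0, 50.0]
cassette = [100.0, 80.0, 120.0, 60.0, 90.0]
cd = [predict_cd_sales(v, c) for v, c in zip(vinyl, cassette)]
b0, b1, b2 = fit_two_predictor_ols(vinyl, cassette, cd)
```

Holding cassette sales fixed, raising the vinyl input by one unit lowers the prediction by 5.20 units, which is ten times the 0.52-unit reduction produced by one additional unit of cassette sales.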

The model shows that the coefficients of the independent variables are both negative, which agrees with the result of the correlation analysis. As previously mentioned, there is a negative linear relationship between compact disc and cassette sales volumes as well as between compact disc and vinyl record sales volumes. The regression results are consistent with audio cassettes and vinyl records both having negative influences on the sales volume of compact discs, although a regression on observational data cannot by itself establish that the relationship is causal. That is, on average, compact disc sales volume is lower when the sales volumes of the other two audio formats are higher.

Based on the magnitude of the correlation coefficient as well as the regression coefficient, it can be deduced that vinyl records have a stronger influence on the sales of compact discs. In fact, the impact of vinyl records sales on compact discs is ten times more than the effect of audio cassettes on CDs. Such is evident on the first figure, wherein it can be observed that with the decline in the sales of the records, the number of compact discs that were sold continued to rise. The rise, however, halted at a certain point in 2000 and then declined all the way to 87.6 million units. At this stage, it is clear that other factors affect the sales volume of compact discs, but such is not included in the scope of this study.

Cassette tapes, on the other hand, did not have such a great negative influence on compact discs. Although the cassette coefficient is negative, its magnitude is only 0.52, which is ten times lower than the vinyl coefficient of 5.20. This explains why the historical sales show that while cassette sales volume was rising from 1980 until the early 1990s, the sales volume of compact discs also continued to rise.

Many factors could explain why compact discs outsold audio cassettes and vinyl records, the first of which is convenience. CDs were much more convenient than vinyl records or cassettes because they do not require flipping and are portable. With the invention of the portable CD player, music lovers could listen to the songs they wanted anywhere they went. Vinyl records, which are much larger, do not possess this portability. And although Sony created the Walkman, which allowed people to take their songs with them on cassette tapes, people eventually preferred CDs because these were more compact and made it easier to search through a list of tracks. In addition, CDs were more resilient to weathering and damage than audio cassettes and vinyl records, which made them more popular among consumers (Berkeley.edu, n.d.).

Because of the aforementioned factors, people gradually shifted from cassettes to compact discs. Eventually, with the increasing demand for CDs, they became more affordable and more easily accessible than cassettes. Moreover, CDs were also easier to produce, hence the lower price. CDs also produced better sound quality than audio cassettes, which also explains why people preferred the former over the latter (Berkeley.edu, n.d.). Based on the RIAA data, audio cassette sales ended in 2008.

As of 2017, only 87.6 million units of compact discs were sold in the United States. This is a decrease of about 90% from 17 years earlier, when CD sales volume reached its peak of 942.5 million units in 2000. It is reasonable to infer that the existence of other audio formats contributed to the decline in compact discs beginning in the early 2000s. While the model developed in this study takes into account the effects of audio cassettes and vinyl records, there are many other factors that should be considered. For instance, the introduction of the iPod, MP3 players, digital downloads, and streaming services has contributed to the decline of compact discs. Data from the RIAA (2018) shows that the decline of CDs in 2001 coincided with the introduction of the iPod. This may have caused many people to shift from using CD players to iPods, thus causing a drop in disc sales. In 2007, first quarter sales of compact discs plunged by 20% from the previous year, which, according to Smith (2007), is the latest sign of the shift in how consumers acquire music. In that same year, Apple sold around 100 million iPods in the United States alone (Smith, 2007).

According to Smith (2007), the sales of 100 million iPods shows that, despite the decline of compact discs, music continues to play a powerful and significant force in the lives of many people. With the invention of the internet, consumers are provided with more ways of obtaining music as compared to years ago, when the only option they had was to walk into a store and physically buy music on a particular audio format. Nowadays, people can just download music, either legally or illegally, from the internet and save it to their laptops and phones. In addition to this, the presence of streaming apps such as Spotify has also made music more accessible to the people, thus causing the shift in the preference of the consumers.

Because there is less demand for CDs, manufacturers have produced fewer units, which makes CDs less accessible to consumers. In addition, it is now difficult to find CD players, as these have been replaced with smart TVs and other equipment. While homes used to have CD players, it is now uncommon for these to actually be used. Hence, the decline of CDs affected not only their manufacturers but also other industries. It is therefore important that the music industry preserve CDs before they become obsolete. One way of preserving CDs is to create innovative marketing schemes that entice people to buy them. Manufacturers may also look into the buying behavior of the target and potential markets and develop strategies aimed at these consumers.

Model Evaluation

Figure 5. Residuals plot

Prior to building the model, this paper first conducted a series of diagnostic tests to check if the data satisfies the assumptions for a multiple linear regression. It is important to perform these tests in order to ensure that the results of the regression are accurate and reliable. In addition to the assumption-checking, it is also important to conduct a series of model evaluation techniques so as to see how the model could explain the data. Evaluating the model is also important in order to assess the predictive capability of the model.

The residuals plot in Figure 5 shows points scattered around the horizontal axis with no discernible pattern. This indicates that the linear regression model that has been built is appropriate for the data. If a systematic pattern had been observed in Figure 5, such as a U-shaped or inverted U-shaped curve, this would indicate that the linear regression model is not appropriate and that a different model may be needed.
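The idea behind a residuals plot can be sketched in a few lines. The numbers below are hypothetical values chosen for illustration only; they are not the RIAA data or the model output of this study. The sign-change count is an informal heuristic, not a formal test: residuals that scatter around zero tend to switch sign often, whereas a U-shaped pattern produces long runs of the same sign.

```python
# Hypothetical actual and predicted values, for illustration only (not RIAA data).
actual = [120.0, 135.0, 150.0, 160.0, 180.0]
predicted = [118.0, 138.0, 149.0, 163.0, 178.0]

# A residual is the actual value minus the value predicted by the model.
residuals = [a - p for a, p in zip(actual, predicted)]

# Residuals from a well-specified linear model should average close to zero.
mean_residual = sum(residuals) / len(residuals)

# Informal check: count how often consecutive residuals switch sign.
sign_changes = sum(
    1 for r1, r2 in zip(residuals, residuals[1:]) if (r1 > 0) != (r2 > 0)
)

print("residuals:", residuals)
print("mean residual:", round(mean_residual, 2))
print("sign changes:", sign_changes)
```

In practice the residuals would be plotted against the predicted values, as in Figure 5, and inspected visually.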

Figure 6 shows the line fit plots, which plot the actual versus the predicted values. The actual values, represented by the blue dots, are from the RIAA data. The predicted values, represented by the red dots, are the sales volumes computed using the statistical model developed in this study. In both the vinyl records and the audio cassettes line fit plots, the red and blue dots lie very close to each other, and the red dots follow almost the same trend as the blue dots. This indicates that the values predicted by the model are very near or almost equal to the actual RIAA values, which suggests that the statistical model has very good predictive capability. It can predict the number of compact disc units sold based on the number of vinyl records and audio cassettes sold.

The coefficient of determination, R², supports the findings stated above. The computed R² is 0.76, which means that 76% of the variation in compact disc sales volume can be explained by the audio cassette and vinyl record sales volumes. Moreover, a value near 1 indicates that the statistical model fits the data well.
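As a quick consistency check, R² can be recomputed from the sums of squares reported in Table 4, since R² is the regression sum of squares divided by the total sum of squares. The following is a minimal sketch; the variable names are illustrative and are not taken from the original analysis.

```python
# Sums of squares as reported in Table 4 (Analysis of Variance).
ss_regression = 1_939_862.0
ss_residual = 599_750.4

# The total sum of squares is the sum of the two components;
# it should agree with the reported total up to rounding.
ss_total = ss_regression + ss_residual

# Coefficient of determination: share of total variation explained by the model.
r_squared = ss_regression / ss_total

print(f"R^2 = {r_squared:.2f}")
```

The result rounds to the 0.76 reported above.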

Finally, an analysis of variance was also conducted in this study to test the hypothesis that all terms in the model are useful for predicting the compact disc sales volume. The null and alternative hypotheses are as follows:

H0: β1 = β2 = 0 (All model terms are unimportant for predicting compact disc sales volume)

Ha: At least one βi ≠ 0, i = 1, 2 (At least one model term is useful for predicting compact disc sales volume)

Figure 6. Line fit plot

Table 4. Analysis of Variance

Source       Degrees of Freedom   Sum of Squares   Mean Square   F Statistic   p-value
Regression   2                    1,939,862.0      969,931.0     37.20         <0.00
Residual     23                   599,750.4        26,076.1
Total        25                   2,539,612.0

As shown in Table 4, the p-value is very small; hence, the null hypothesis may be rejected. This leads to the decision that at least one model term is useful for predicting the sales volume of compact discs based on the sales volumes of vinyl records and audio cassettes. The analysis of variance is important in order to determine if the model provides an adequate approximation to the actual system. Based on the results of the ANOVA, it can be concluded that the model is useful and significant.
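The mean squares and the F statistic in Table 4 follow directly from the sums of squares and degrees of freedom. The sketch below reproduces that arithmetic; the variable names are illustrative, and the p-value is not recomputed here because it requires the F distribution, which is outside the Python standard library.

```python
# Sums of squares and degrees of freedom as reported in Table 4.
ss_regression, df_regression = 1_939_862.0, 2
ss_residual, df_residual = 599_750.4, 23

# A mean square is a sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# The F statistic is the ratio of the regression and residual mean squares.
f_statistic = ms_regression / ms_residual

print(f"MS regression = {ms_regression:,.1f}")
print(f"MS residual   = {ms_residual:,.1f}")
print(f"F             = {f_statistic:.2f}")
```

The computed values agree with the Mean Square and F Statistic columns of Table 4.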

Chapter 4. Summary and Conclusions

This study aimed to assess the association among the sales volumes of compact discs, vinyl records, and audio cassettes. Results show that there is a strong negative linear relationship between compact discs and audio cassettes, with a correlation coefficient of -0.7. Moreover, there is a moderate negative linear relationship between compact discs and vinyl records, with a correlation coefficient of -0.5, and a weak negative linear relationship between audio cassettes and vinyl records, with a correlation coefficient of -0.1.

The second objective of this study was to develop a statistical model of compact disc sales volume as a function of two other audio formats, namely, vinyl records and audio cassettes. Prior to building the model, the assumptions of multiple linear regression were first assessed. Results of a series of assessments show that the data satisfies all four assumptions: linearity, normality of residuals, absence of multicollinearity, and constancy of variance. After it was shown that the data does not violate any assumption, this study used least squares estimation in developing the model, which is as follows:

CompactDisc = 778.08 - 5.20*VinylRecord - 0.52*AudioCassette

A series of model evaluation tests were conducted in this study to ensure that the model is appropriate. A residuals plot shows that the linear regression model is appropriate for the data. In addition, line fit plots were generated, which show that the statistical model has a good predictive capability. It has also been shown that the model fits the actual data well, with an R-squared equal to 0.76. Finally, an analysis of variance was conducted, and results show that the model is useful and significant.
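The fitted equation can be applied to new inputs as in the minimal sketch below. The function name and the input values are hypothetical and chosen only for illustration; the inputs are not RIAA figures, and they are assumed to be in the same units as the data used to fit the model.

```python
def predict_compact_disc(vinyl_record: float, audio_cassette: float) -> float:
    """Predicted compact disc sales volume from the fitted equation in this study."""
    # Coefficients are those of the least squares model reported above.
    return 778.08 - 5.20 * vinyl_record - 0.52 * audio_cassette


# Hypothetical inputs for illustration only (not RIAA figures).
example = predict_compact_disc(vinyl_record=10.0, audio_cassette=100.0)
print(round(example, 2))
```

Because both coefficients are negative, higher vinyl record or audio cassette sales correspond to a lower predicted compact disc sales volume.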

While this study followed the necessary steps in building a statistical model, it does have its limitations. First, the data set used in building the model included only 26 observations. This is because there is no data on the number of audio cassettes sold from 2009 to 2017. In addition, because compact discs were introduced in 1983, there is no CD data from 1973 to 1982. This meant that some data points for all three variables were not used in developing the model. These were, however, used in generating descriptive statistics. A similar study may be conducted, and it is recommended to use more data points in order to improve the model.

Second, the model was built solely on two audio formats—vinyl records and audio cassettes. While the study shows that these variables explain a large portion of the variation in the sales volume of compact discs, it would also be interesting to explore other independent variables. A similar study may be conducted that will explore the relationships between compact discs and other audio formats, specifically 8-track, streaming apps, downloaded music, and others.

