Scatterplots


[fs-toc-h2]What is a scatterplot?

A scatterplot shows individual pieces of real-world data as points on a graph, where the x-axis represents one measured quantity and the y-axis represents another. For example, a survey could collect data on the relationship between hours studied per week and GPA. This data could then be plotted on a graph where the x-axis is hours studied and the y-axis is GPA.

These graphs usually have a linear model placed onto them, known as a line of best fit. This line is used to predict unknown values that were not given in the data.
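To make the idea of "predicting unknown values" concrete, here is a minimal sketch in Python. It assumes the line of best fit is the ordinary least-squares line (the lesson doesn't say which method is used), and the survey numbers are invented purely for illustration. The helper name `fit_line` is also hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, divided by how much x varies on its own
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical survey data (made up for this sketch, not real results)
hours = [2, 4, 6, 8, 10]
gpa = [2.4, 2.9, 3.1, 3.6, 3.8]

slope, intercept = fit_line(hours, gpa)

# Predict the GPA for 7 hours of study, a value that was not in the data
predicted_gpa = slope * 7 + intercept
print(slope, intercept, predicted_gpa)
```

On the SAT you won't compute this by hand. The line is drawn for you, and you only read values off it.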

Linear lines of best fit have four different possibilities: 

Take a look at the example below. 

[fs-toc-h4]Example 1

A strong negative association means the data points sit relatively close to one another and follow a negative slope (a downward trend).

While choice A does follow a downward trend, its data points are quite scattered.

Choice B has a strange arrangement of data that falls almost in a horizontal line, meaning its association is NOT negative.

Choice C's data points are relatively close together, meaning there is a strong association, but they follow an upward trend (positive slope).

That leaves us with choice D, which has both a strong association between data points and a downward trend (negative slope), making it the right answer. 

[fs-toc-h2]Slope and expected values

On the SAT, many questions will ask you to find the slope of the line of best fit. For straight lines of best fit, this process is no different from finding the slope of any other linear graph, as we went over in the linear equations lesson. Take a look at the question below.

NOTE: Not all lines of best fit will be straight. On the test, some may take the form of a parabola (quadratic). 

[fs-toc-h4]Example 1

The easiest way to find the slope here is to choose any two points on the line and use the slope equation. 

Be careful when trying to count the rise over run here. Notice that the scale of the y-axis is very different from the scale of the x-axis: each step on the y-axis goes up by 100, while each step on the x-axis goes up by 2. 
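A quick sketch of why counting grid steps alone goes wrong. The step counts here (2 steps up, 3 steps across) are an illustrative assumption, and `slope_from_grid_steps` is a hypothetical helper name. The scales (100 per y-step, 2 per x-step) come from the graph described above.

```python
def slope_from_grid_steps(rise_steps, run_steps, y_per_step, x_per_step):
    """Convert counted grid steps into real units before dividing."""
    rise = rise_steps * y_per_step  # vertical change in y-axis units
    run = run_steps * x_per_step    # horizontal change in x-axis units
    return rise / run


# Assumed example: the line climbs 2 grid steps while moving 3 grid steps right
naive_slope = 2 / 3  # ignores the axis scales, so it is wrong
scaled_slope = slope_from_grid_steps(2, 3, y_per_step=100, x_per_step=2)

print(round(naive_slope, 2))   # 0.67
print(round(scaled_slope, 1))  # 33.3
```

The two results differ by a factor of 50, which is exactly the ratio of the two axis scales (100 divided by 2).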

Find two places where the line comes closest to crossing an exact grid point. The slope we find doesn't have to be perfect; it just has to be a close enough approximation to match up with the answer choices. The two points we’ll use to find the slope are (16, 600) and (22, 800).

If you remember from the linear equations lesson, always label each number with either \(x_1, x_2, y_1, \text{ or } y_2\) to ensure you don't mess up. In this case, I'll set

\[16 = x_1\]

\[600 = y_1\]

\[22 = x_2\]

\[800 = y_2\]

Plug these numbers into the slope equation to get a rough approximation of the slope of the line:

\[\text{slope} = \frac{y_2 - y_1}{x_2 - x_1} = \frac{800 - 600}{22 - 16} = \frac{200}{6} \approx 33.3\]

This slope matches most closely with choices C and D, meaning you can eliminate choices A and B.
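If you'd like to double-check the arithmetic, the slope equation can be sketched in a few lines of Python. The function name `slope` is just an illustrative choice, and the two points are the ones picked above.

```python
def slope(p1, p2):
    """Slope between two points: (y2 - y1) / (x2 - x1)."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)


m = slope((16, 600), (22, 800))
print(round(m, 1))  # 33.3
```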

To find the y-intercept (denoted by the constant \(b\)), write out everything we currently know as a linear equation \((y = mx + b)\):

\[d=33t+b\]

You can now choose a point to plug in for \(d\) and \(t\) (\(d\) corresponds to \(y\), while \(t\) corresponds to \(x\)). In this case, I'll use (22, 800).

\[d=33t+b\]

 

\[800=33(22)+b\]

\[800=726+b\]

\[74 = b\]

The value of 74 is much closer to 84 than it is to 300, making choice D the right answer.
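The same steps can be sketched in Python: rearrange \(y = mx + b\) into \(b = y - mx\), plug in the rounded slope of 33 and the point (22, 800), then see which candidate intercept is nearest. The candidate values 84 and 300 are the two mentioned above. The helper name `y_intercept` is hypothetical.

```python
def y_intercept(m, point):
    """Solve y = m*x + b for b, given the slope and one point on the line."""
    x, y = point
    return y - m * x


b = y_intercept(33, (22, 800))
print(b)  # 74

# Pick whichever remaining intercept is closest to our estimate
candidates = [84, 300]
closest = min(candidates, key=lambda c: abs(c - b))
print(closest)  # 84
```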

NOTE: This question CAN also be solved using the Desmos regression feature, by finding the approximate \(y = mx + b\) equation from two points. Click here and see the linear relationships section for more information.

[fs-toc-h4]Example 2

Since the x-axis corresponds to width, and the measurement we’re given is also in width, plug in the number for the x-value within the equation

\[y=1.67x+21.1\]

\[y=1.67(19)+21.1\]

\[y= 52.83\]

This makes choice C the correct answer. 
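As a sanity check, here is a small Python sketch of the same substitution. The function name `predict` and its default arguments are illustrative. The slope 1.67, intercept 21.1, and width 19 come from the work above.

```python
def predict(x, m=1.67, b=21.1):
    """Evaluate the line of best fit y = m*x + b at a given x."""
    return m * x + b


y = predict(19)
print(round(y, 2))  # 52.83
```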

[fs-toc-h4]Example 3

For the swim that took him 34 minutes, scanning up the graph shows that his heart rate was 148 bpm, at the point (34, 148).

Where the line of best fit crosses the 34-minute mark, the y-value is 150, at the point (34, 150).

The difference between 150 and 148 is 2, making choice B the correct answer.
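This comparison between an actual data point and the line of best fit can be sketched as follows. The function name `prediction_gap` is hypothetical. The values 148 and 150 are the ones read off the graph above.

```python
def prediction_gap(observed, predicted):
    """Absolute difference between a data point and the line of best fit."""
    return abs(predicted - observed)


gap = prediction_gap(observed=148, predicted=150)
print(gap)  # 2
```

Taking the absolute value means the order of the two numbers doesn't matter. If a question asks whether the actual value is above or below the prediction, keep the sign instead.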

[fs-toc-h4]Example 4

First, find approximately where -2 is on the x-axis. This would be a little before the hash mark between -5 and 0. Now scan up from that spot until you reach the line of best fit, which is at approximately 70%, meaning choice C is the correct answer.

Additional Resources

Khan Academy Scatterplot Example 1

Khan Academy Scatterplot Example 2