<h2>Intuitive Curve Fitting</h2>
<p>Earlier, we used various means, such as histograms, frequency polygons and ogives, to visualise our data. These are very useful tools to depict <strong>univariate</strong> data, i.e. data with only one variable such as the height of learners in a class.</p>
<p>Last year we also learnt about a visual tool called scatter plots. <strong>Scatter plots</strong> are a common way to visualise <strong>bivariate</strong> data, i.e. data with two variables. They allow us to identify the <em>nature</em>, <em>direction</em> and <em>strength</em> of a relationship between two variables.</p>
<p>We identify the nature of a relationship between two variables by examining whether the points on the scatter plot conform to a linear, exponential, quadratic or some other function. The process of fitting functions to data is known as <strong>curve fitting</strong>.</p>
<p>The strength of a relationship can be described as <em>strong</em> if the data points lie close to the fitted function or <em>weak</em> if they are scattered more widely around it.</p>
<p>In the case of linear functions, the direction of a relationship is <em>positive</em> if high values of one variable occur with high values of the other or <em>negative</em> if high values of one variable occur with low values of the other.</p>
<p>The table below summarises the different relationships:</p>
<table>
<tbody>
<tr>
<td><img alt="d8bd90ae5b2e003c50b248fb41fc8479.png" src="https://data.ulearngo.com/assets/5dfbb39a-e39f-4680-abf3-68357201fb0b"/></td>
<td><img alt="1fe4dc6e7b0275fbb0c7fb289ea58e7c.png" src="https://data.ulearngo.com/assets/7ec94a36-75b8-49c0-abaf-a4c3b823de8e"/></td>
</tr>
<tr>
<td>Strong, positive linear relationship</td>
<td>Strong, negative linear relationship</td>
</tr>
<tr>
<td><img alt="e43b5b86ca3d4c237a2a038dc88a4113.png" src="https://data.ulearngo.com/assets/6b049986-6fbe-44a3-adb3-16129303980f"/></td>
<td><img alt="3bf2a920bc3fdab3d0a0292f17872ecb.png" src="https://data.ulearngo.com/assets/ddb1e702-2a62-4a3d-b684-4918b9a66f13"/></td>
</tr>
<tr>
<td>Weak, positive linear relationship</td>
<td>Exponential relationship</td>
</tr>
<tr>
<td><img alt="4519f3fd72c93dcbda5439be5095b8c9.png" src="https://data.ulearngo.com/assets/a92ea544-3ca8-4d46-86b2-22dba57da016"/></td>
<td><img alt="198f27a7b888515870a7a359177df27f.png" src="https://data.ulearngo.com/assets/aaaa67f6-ae42-4ef1-8908-6a3c126ec7f1"/></td>
</tr>
<tr>
<td>Quadratic relationship</td>
<td>No relationship</td>
</tr>
</tbody>
</table>
<div class="ns-school-emp">
<h2>Example</h2>
<div class="ns-school-defn">
<h3>Question</h3>
<div>
<p>Examine the scatter plot below of data collected from a new shop:</p>
<img alt="6e11101e62a1c7bac9665f8b4462f4e5.png" src="https://data.ulearngo.com/assets/4b92fd68-ae3a-4574-b32b-a644e2743a68"/>
<ul>
<li>What are the two variables being compared?</li>
<li>What type of function best fits the data?</li>
<li>Is the relationship between the two variables strong or weak?</li>
<li>Is the relationship between the two variables positive or negative?</li>
<li>Using your answers above, describe the relationship between the two variables in one sentence.</li>
</ul>
</div>
<ul>
<li>The variables being compared are the average daily number of customers and the time in months.</li>
<li>The data fit an exponential function.</li>
<li>The data points fit the curve almost perfectly, so the relationship can be described as very strong.</li>
<li>As time increases, the number of customers increases, so the relationship can be described as positive.</li>
<li>There is a very strong, positive, exponential relationship between average daily customers and time in the new shop.</li>
</ul>
</div>
</div>
<p>In the worked example above, by plotting the average daily customers of a new shop against time on a scatter plot, we were able to identify the relationship between the two variables. Once we know the relationship between two variables, we can do another very useful thing: we can predict values where no data exist.</p>
<div class="ns-school-emp">
<h3>Definition: Interpolation and extrapolation</h3>
<div class="ns-school-defn">
<p>When we predict values that fall within the range of our data, this is known as <strong>interpolation</strong>. When we predict the values of a variable beyond the range of our data, this is known as <strong>extrapolation</strong>.</p>
</div>
</div>
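<p>The distinction in the definition above can be sketched in a few lines of Python. This is a minimal illustration only: the function name <code>classify_prediction</code> and the list <code>months</code> are hypothetical and do not come from the lesson's data.</p>

```python
def classify_prediction(x_new, x_data):
    """Say whether predicting at x_new is interpolation or extrapolation.

    A prediction is interpolation when x_new lies within the range of the
    observed values, and extrapolation when it lies beyond that range.
    """
    lowest, highest = min(x_data), max(x_data)
    if lowest <= x_new <= highest:
        return "interpolation"
    return "extrapolation"


# Hypothetical observations: data were recorded for months 1 to 6.
months = [1, 2, 3, 4, 5, 6]

print(classify_prediction(3.5, months))  # inside the observed range
print(classify_prediction(9, months))    # beyond the observed range
```

<p>The check depends only on the smallest and largest observed values, not on how many data points lie in between.</p>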
<p>Extrapolation must be done with caution unless it is known that the observed relationship continues beyond the range of our data. For example, an exponential function may look linear if we only have the first few data points available, but if we extend a straight line far beyond those initial points, our predictions will be inaccurate.</p>
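<p>The warning about extrapolation can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the exponential function \(y = \text{1.1}^{x}\), the five observed points, and the use of a line joining the first and last observed points as a stand-in for a hand-drawn line.</p>

```python
def true_value(x):
    # Hypothetical exponential relationship, chosen only for illustration.
    return 1.1 ** x


# Suppose we have only observed the first few data points.
x_seen = [0, 1, 2, 3, 4]
y_seen = [true_value(x) for x in x_seen]

# Stand-in for a hand-drawn line: join the first and last observed points.
m = (y_seen[-1] - y_seen[0]) / (x_seen[-1] - x_seen[0])
c = y_seen[0] - m * x_seen[0]


def line_value(x):
    return m * x + c


# Compare the straight-line prediction with the true value, first inside
# the observed range (x = 2) and then further and further beyond it.
for x in [2, 10, 30]:
    print(x, round(line_value(x), 2), round(true_value(x), 2))
```

<p>Inside the observed range the two columns agree closely, while far beyond it the straight line falls well short of the exponential values.</p>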
<p>In order to interpolate or extrapolate values, we need to find the equation of the function which best fits the data. For linear data, we draw a straight line through the data which best approximates the available data points. This line is known as the <strong>line of best fit</strong> or trend line. Let us try our hand at this in the following example.</p>
<div class="ns-school-emp">
<h2>Example</h2>
<div class="ns-school-defn">
<h3>Question</h3>
<div>
<ul>
<li>Use the data below to draw a scatter plot and line of best fit.</li>
<li>Write down the equation of the line that best seems to fit the data.</li>
<li>Use your equation to calculate the estimated value for \(y\) if \(x = 4\).</li>
<li>Use your equation to calculate the estimated value for \(x\) if \(y = 6\).</li>
</ul>
<table>
<tbody>
<tr>
<td>
<p>\(x\)</p>
</td>
<td>
<p>\(\text{1.0}\)</p>
</td>
<td>
<p>\(\text{2.4}\)</p>
</td>
<td>
<p>\(\text{3.1}\)</p>
</td>
<td>
<p>\(\text{4.9}\)</p>
</td>
<td>
<p>\(\text{5.6}\)</p>
</td>
<td>
<p>\(\text{6.2}\)</p>
</td>
</tr>
<tr>
<td>
<p>\(y\)</p>
</td>
<td>
<p>\(\text{2.5}\)</p>
</td>
<td>
<p>\(\text{2.8}\)</p>
</td>
<td>
<p>\(\text{3.0}\)</p>
</td>
<td>
<p>\(\text{4.8}\)</p>
</td>
<td>
<p>\(\text{5.1}\)</p>
</td>
<td>
<p>\(\text{5.3}\)</p>
</td>
</tr>
</tbody>
</table>
</div>
<div>
<h3>Draw the graph</h3>
<ol>
<li>Choose a suitable scale for the axes.</li>
<li>Draw the axes.</li>
<li>Plot the points.</li>
</ol>
<img alt="994c3314a825c323de7e3513d6ee4b8e.png" src="https://data.ulearngo.com/assets/55f9ffa1-e8f9-4cdd-8f97-405319284ec2"/></div>
<div>
<h3>Drawing the line of best fit</h3>
<p>The next step is to draw a straight line that passes as close as possible to as many of the points as possible. As a rule of thumb, aim to have roughly as many points above the line as below it.</p>
<img alt="2f465ddbebbec8a7e4676fb11f055fc8.png" src="https://data.ulearngo.com/assets/879d8fe5-80b1-4b71-bada-5b8922839991"/></div>
<div>
<h3>Calculating the equation of the line</h3>
<p>The equation of the line is</p>

\(y=mx+c\)

<p>From the graph we have drawn, we estimate the \(y\)-intercept to be \(\text{1.5}\). We also estimate that \(y=\text{3.5}\) when \(x=3\). So the points \((0;\text{1.5})\) and \((3;\text{3.5})\) lie on the line. The gradient of the line, \(m\), is given by</p>

\begin{align*} m &amp; = \cfrac{\Delta y}{\Delta x} = \cfrac{{y}_{2}-{y}_{1}}{{x}_{2}-{x}_{1}} \\ &amp; = \cfrac{\text{3.5}-\text{1.5}}{3-0} \\ &amp; = \cfrac{2}{3} \end{align*}

<p>So we finally have that the equation of the line of best fit is</p>

\(y=\cfrac{2}{3}x+\text{1.5}\)

<div>
<h3>Calculate the unknown values</h3>
<p>The equation of the line is \(y=\cfrac{2}{3}x+\text{1.5}\) so in order to find the unknown values, we insert the known values into our equation.</p>
<p>For \(x = 4\):</p>

\begin{align*} y &amp;=\cfrac{2}{3} \cdot 4 +\text{1.5}\\ &amp;\approx \text{4.17} \end{align*}

<p>Since this \(x\)-value is within the data range, this is <strong>interpolation</strong>.</p>
<p>For \(y = 6\):</p>

\begin{align*} 6 &amp; =\cfrac{2}{3} \cdot x +\text{1.5} \\ \therefore x &amp;= (6 - \text{1.5}) \times \cfrac{3}{2} \\ &amp;= \text{6.75} \end{align*}

<p>Since this \(y\)-value is greater than the largest \(y\)-value in the data, \(\text{5.3}\), this is <strong>extrapolation</strong>.</p>
</div>
</div>
</div>
</div>
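<p>The arithmetic in the worked example can be checked with a short Python sketch. It uses the data table and the two points estimated from the hand-drawn line, \((0;\text{1.5})\) and \((3;\text{3.5})\). The helper names <code>predict_y</code>, <code>predict_x</code> and <code>in_range</code> are hypothetical, and the final count of points above and below the line is only a rough check of the drawing guideline, not a formal measure of fit.</p>

```python
# Data from the table in the example.
x_data = [1.0, 2.4, 3.1, 4.9, 5.6, 6.2]
y_data = [2.5, 2.8, 3.0, 4.8, 5.1, 5.3]

# Two points read off the hand-drawn line of best fit.
x1, y1 = 0.0, 1.5
x2, y2 = 3.0, 3.5

# Gradient m = (y2 - y1) / (x2 - x1); since x1 = 0, y1 is the y-intercept c.
m = (y2 - y1) / (x2 - x1)
c = y1


def predict_y(x):
    """Estimate y for a given x using y = m*x + c."""
    return m * x + c


def predict_x(y):
    """Estimate x for a given y by rearranging y = m*x + c."""
    return (y - c) / m


def in_range(value, data):
    """True if value lies within the observed range of data."""
    return min(data) <= value <= max(data)


y_at_4 = predict_y(4)
x_at_6 = predict_x(6)

print("gradient:", round(m, 3))
print("y when x = 4:", round(y_at_4, 2), "| x = 4 within data:", in_range(4, x_data))
print("x when y = 6:", round(x_at_6, 2), "| y = 6 within data:", in_range(6, y_data))

# Rough check of the guideline: count the data points above and below the line.
above = sum(1 for x, y in zip(x_data, y_data) if y > predict_y(x))
below = sum(1 for x, y in zip(x_data, y_data) if y < predict_y(x))
print("points above the line:", above, "| points below the line:", below)
```

<p>For this particular line the count is two points above and four below, which suggests that the line estimated by eye sits slightly high and that a line drawn by hand is only an approximation.</p>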
<p>In the previous worked example, we drew the line of best fit by hand. This can give a reasonable approximation of the function that best fits the data when the data points lie close to a straight line. However, because the line is judged by eye, different people will usually draw slightly different lines and so obtain slightly different answers. In the next section, we will learn about a more precise way of fitting a linear function to data.</p>
