A guide to using np.polyfit in Python

Question

Let me start by saying that what I get may not be what I expect, and perhaps you can help me here. I have the following data:

>>> x
array([ 3.08,  3.1 ,  3.12,  3.14,  3.16,  3.18,  3.2 ,  3.22,  3.24,
    3.26,  3.28,  3.3 ,  3.32,  3.34,  3.36,  3.38,  3.4 ,  3.42,
    3.44,  3.46,  3.48,  3.5 ,  3.52,  3.54,  3.56,  3.58,  3.6 ,
    3.62,  3.64,  3.66,  3.68])

>>> y
array([ 0.000857,  0.001182,  0.001619,  0.002113,  0.002702,  0.003351,
    0.004062,  0.004754,  0.00546 ,  0.006183,  0.006816,  0.007362,
    0.007844,  0.008207,  0.008474,  0.008541,  0.008539,  0.008445,
    0.008251,  0.007974,  0.007608,  0.007193,  0.006752,  0.006269,
    0.005799,  0.005302,  0.004822,  0.004339,  0.00391 ,  0.003481,
    0.003095])

Now I want to fit these data with, say, a 4th-degree polynomial. So I do:

>>> coefs = np.polynomial.polynomial.polyfit(x, y, 4)
>>> ffit = np.poly1d(coefs)

Now I create a new grid of x values on which to evaluate the fitted function ffit:

>>> x_new = np.linspace(x[0], x[-1], num=len(x)*10)

When I plot everything (the data set and the fitted curve) with the commands:

>>> fig1 = plt.figure()
>>> ax1 = fig1.add_subplot(111)
>>> ax1.scatter(x, y, facecolors='None')
>>> ax1.plot(x_new, ffit(x_new))
>>> plt.show()

I get the following:

fit_data.png

What I expected was for the fitted function to match the data closely (at least near the maximum of the data). What am I doing wrong?

Thanks in advance.


numpy.polyfit(x, y, deg, rcond=None, full=False, w=None, cov=False)

Least squares polynomial fit.

Note

This forms part of the old polynomial API. Since version 1.4, the new polynomial API defined in numpy.polynomial is preferred. A summary of the differences can be found in the transition guide.

Fit a polynomial p(x) = p[0] * x**deg + ... + p[deg] of degree deg to points (x, y). Returns a vector of coefficients p that minimises the squared error in the order deg, deg-1, … 0.

The Polynomial.fit class method is recommended for new code as it is more stable numerically. See the documentation of the method for more information.
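As a minimal sketch of that recommendation, the snippet below fits the same cubic with both APIs and compares the coefficients. The sample data are borrowed from the Examples section; the variable names are illustrative. Note that Polynomial.fit works in a scaled internal domain, so convert() is needed to express the coefficients in the original x variable, and that the new API orders coefficients from the lowest power to the highest.

```python
import numpy as np
from numpy.polynomial import Polynomial

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 0.8, 0.9, 0.1, -0.8, -1.0])

# New API: returns a Polynomial object fitted in a scaled domain.
p_new = Polynomial.fit(x, y, deg=3)

# convert() maps the coefficients back to the unscaled x variable.
# They are ordered from the lowest power to the highest.
coef_low_to_high = p_new.convert().coef

# Old API: plain coefficient array, highest power first.
coef_high_to_low = np.polyfit(x, y, 3)

print(coef_low_to_high)
print(coef_high_to_low[::-1])  # reversed, for a like-for-like comparison
```

The Polynomial object can be called directly, for example p_new(2.5), without converting it first.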

Parameters

x : array_like, shape (M,)

x-coordinates of the M sample points (x[i], y[i]).

y : array_like, shape (M,) or (M, K)

y-coordinates of the sample points. Several data sets of sample points sharing the same x-coordinates can be fitted at once by passing in a 2D-array that contains one dataset per column.

deg : int

Degree of the fitting polynomial.

rcond : float, optional

Relative condition number of the fit. Singular values smaller than this relative to the largest singular value will be ignored. The default value is len(x)*eps, where eps is the relative precision of the float type, about 2e-16 in most cases.

full : bool, optional

Switch determining the nature of the return value. When it is False (the default), only the coefficients are returned; when True, diagnostic information from the singular value decomposition is also returned.

w : array_like, shape (M,), optional

Weights. If not None, the weight w[i] applies to the unsquared residual y[i] - y_hat[i] at x[i]. Ideally the weights are chosen so that the errors of the products w[i]*y[i] all have the same variance. When using inverse-variance weighting, use w[i] = 1/sigma(y[i]). The default value is None.

cov : bool or str, optional

If given and not False, return not just the estimate but also its covariance matrix. By default, the covariance is scaled by chi2/dof, where dof = M - (deg + 1), i.e., the weights are presumed to be unreliable except in a relative sense and everything is scaled such that the reduced chi2 is unity. This scaling is omitted if cov='unscaled', as is relevant for the case that the weights are w = 1/sigma, with sigma known to be a reliable estimate of the uncertainty.
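The w and cov parameters can be illustrated together. The sketch below uses made-up data in which four points lie exactly on y = 2x + 1 and a fifth is an outlier with a large assumed uncertainty; the sigma values are hypothetical, not from the documentation.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 5.0, 7.0, 20.0])        # last point is an outlier
sigma = np.array([0.1, 0.1, 0.1, 0.1, 100.0])   # assumed 1-sigma uncertainties

# Without weights the outlier pulls the line away from y = 2x + 1.
unweighted = np.polyfit(x, y, 1)

# w multiplies the unsquared residual, so pass 1/sigma, not 1/sigma**2.
# cov='unscaled' treats sigma as a reliable absolute uncertainty.
weighted, V = np.polyfit(x, y, 1, w=1.0 / sigma, cov="unscaled")

# Standard error of each coefficient (slope first, then intercept).
stderr = np.sqrt(np.diag(V))

print(unweighted)
print(weighted, stderr)
```

With the outlier strongly down-weighted, the weighted fit stays close to slope 2 and intercept 1, while the unweighted slope is noticeably larger.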

Returns

p : ndarray, shape (deg + 1,) or (deg + 1, K)

Polynomial coefficients, highest power first. If y was 2-D, the coefficients for the k-th data set are in p[:,k].

residuals, rank, singular_values, rcond

These values are only returned if full == True.

residuals – sum of squared residuals of the least squares fit
rank – the effective rank of the scaled Vandermonde coefficient matrix
singular_values – singular values of the scaled Vandermonde coefficient matrix
rcond – value of rcond

For more details, see numpy.linalg.lstsq.

V : ndarray, shape (deg + 1, deg + 1) or (deg + 1, deg + 1, K)

Present only if full == False and cov == True. The covariance matrix of the polynomial coefficient estimates. The diagonal of this matrix holds the variance estimate for each coefficient. If y is a 2-D array, then the covariance matrix for the k-th data set is in V[:,:,k].
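The different return shapes described above can be checked with a short sketch. It reuses the sample data from the Examples section and adds a hypothetical two-column y2 to show the 2-D case; all variable names are illustrative.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 0.8, 0.9, 0.1, -0.8, -1.0])

# full=True adds the least-squares diagnostics to the return value.
p, residuals, rank, singular_values, rcond = np.polyfit(x, y, 3, full=True)
print(residuals, rank, singular_values, rcond)

# Two data sets sharing the same x, stored one per column.
y2 = np.column_stack([2.0 * x + 1.0, -x + 3.0])   # shape (6, 2)
coefs = np.polyfit(x, y2, 1)                      # shape (2, 2)

# Column k holds the coefficients of data set k, highest power first.
slope_0, intercept_0 = coefs[:, 0]
slope_1, intercept_1 = coefs[:, 1]
print(slope_0, intercept_0, slope_1, intercept_1)
```

For 1-D y with more points than coefficients, residuals is a one-element array whose value equals the sum of squared differences between the fit and the data.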

Warns

RankWarning

The rank of the coefficient matrix in the least-squares fit is deficient. The warning is only raised if full == False.

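The warnings can be turned off as shown below (in NumPy 2.0 and later the class is np.exceptions.RankWarning, because np.RankWarning was removed from the main namespace):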

>>> import warnings
>>> warnings.simplefilter('ignore', np.RankWarning)

Notes

The solution minimizes the squared error

\[E = \sum_{j=0}^k |p(x_j) - y_j|^2\]

in the equations:

x[0]**n * p[0] + ... + x[0] * p[n-1] + p[n] = y[0]
x[1]**n * p[0] + ... + x[1] * p[n-1] + p[n] = y[1]
...
x[k]**n * p[0] + ... + x[k] * p[n-1] + p[n] = y[k]

The coefficient matrix of the coefficients p is a Vandermonde matrix.
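To make the connection to the Vandermonde matrix concrete, the following sketch builds the matrix with np.vander and solves the same least-squares problem with np.linalg.lstsq. It is an illustration of the equations above, not a copy of polyfit's internal implementation (which also scales the columns).

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 0.8, 0.9, 0.1, -0.8, -1.0])
deg = 3

# Vandermonde matrix: columns are x**3, x**2, x**1, x**0.
A = np.vander(x, deg + 1)

# Solve A @ p = y in the least-squares sense.
p_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

# polyfit solves the same problem and returns the same ordering.
p_polyfit = np.polyfit(x, y, deg)

print(p_lstsq)
print(p_polyfit)
```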

polyfit issues a RankWarning when the least-squares fit is badly conditioned. This implies that the best fit is not well-defined due to numerical error. The results may be improved by lowering the polynomial degree or by replacing x by x - x.mean(). The rcond parameter can also be set to a value smaller than its default, but the resulting fit may be spurious: including contributions from the small singular values can add numerical noise to the result.
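The effect of replacing x by x - x.mean() can be seen in the condition number of the Vandermonde matrix. The sketch below uses the same x range as the question's data (3.08 to 3.68 in steps of 0.02); the y values are hypothetical smooth data chosen only for illustration.

```python
import numpy as np

x = np.linspace(3.08, 3.68, 31)   # same range as the question's x values
deg = 4

# Condition number of the raw and the centered Vandermonde matrices.
cond_raw = np.linalg.cond(np.vander(x, deg + 1))
x_mean = x.mean()
cond_centered = np.linalg.cond(np.vander(x - x_mean, deg + 1))
print(cond_raw, cond_centered)

# Hypothetical smooth data. Fit in the centered variable, and subtract
# the same mean again whenever the fit is evaluated.
y = 0.0085 * np.exp(-((x - 3.39) / 0.2) ** 2)
p_centered = np.polyfit(x - x_mean, y, deg)
y_fit = np.polyval(p_centered, x - x_mean)
```

The coefficients of p_centered describe a polynomial in (x - x_mean), so they are not directly comparable with coefficients fitted against the raw x.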

Note that fitting polynomial coefficients is inherently badly conditioned when the degree of the polynomial is large or the interval of sample points is badly centered. The quality of the fit should always be checked in these cases. When polynomial fits are not satisfactory, splines may be a good alternative.

Examples

>>> import numpy as np
>>> import warnings
>>> x = np.array([0.0, 1.0, 2.0, 3.0,  4.0,  5.0])
>>> y = np.array([0.0, 0.8, 0.9, 0.1, -0.8, -1.0])
>>> z = np.polyfit(x, y, 3)
>>> z
array([ 0.08703704, -0.81349206,  1.69312169, -0.03968254]) # may vary

It is convenient to use poly1d objects for dealing with polynomials:

>>> p = np.poly1d(z)
>>> p(0.5)
0.6143849206349179 # may vary
>>> p(3.5)
-0.34732142857143039 # may vary
>>> p(10)
22.579365079365115 # may vary
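One pitfall with poly1d is coefficient order, which is relevant to the question at the top of this page. np.polyfit returns the highest power first, which is what np.poly1d expects, whereas np.polynomial.polynomial.polyfit returns the lowest power first. The sketch below uses hypothetical data from an exact quadratic to show what happens when the two conventions are mixed.

```python
import numpy as np
from numpy.polynomial import polynomial as P

x = np.linspace(3.08, 3.68, 31)
y = 1.0 + 2.0 * x - 0.5 * x**2     # hypothetical data: an exact quadratic

c_old = np.polyfit(x, y, 2)        # highest power first
c_new = P.polyfit(x, y, 2)         # lowest power first

good = np.poly1d(c_old)            # poly1d expects highest power first
bad = np.poly1d(c_new)             # wrong order: a different polynomial

# Either reverse the coefficients or use the matching evaluator.
fixed = np.poly1d(c_new[::-1])
also_good = P.polyval(x, c_new)

print(np.max(np.abs(good(x) - y)))   # close to zero
print(np.max(np.abs(bad(x) - y)))    # large
```

In the question's code, passing coefs[::-1] to np.poly1d, or evaluating with np.polynomial.polynomial.polyval(x_new, coefs), would keep the two conventions consistent.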

High-order polynomials may oscillate wildly:

>>> with warnings.catch_warnings():
...     warnings.simplefilter('ignore', np.RankWarning)
...     p30 = np.poly1d(np.polyfit(x, y, 30))
...
>>> p30(4)
-0.80000000000000204 # may vary
>>> p30(5)
-0.99999999999999445 # may vary
>>> p30(4.5)
-0.10547061179440398 # may vary

Illustration:

>>> import matplotlib.pyplot as plt
>>> xp = np.linspace(-2, 6, 100)
>>> _ = plt.plot(x, y, '.', xp, p(xp), '-', xp, p30(xp), '--')
>>> plt.ylim(-2,2)
(-2, 2)
>>> plt.show()