Exercise 4: Logistic Regression and Newton's Method
2011-10-16 12:43
Raw page: http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=MachineLearning&doc=exercises/ex4/ex4.html
In this exercise, you will use Newton's Method to implement logistic regression on a classification problem.
Data
To begin, download ex4Data.zip and extract the files from the zip file.
For this exercise, suppose that a high school has a dataset representing 40 students who were admitted to college and 40 students who were not admitted. Each training example contains a student's scores on two standardized exams and a label of whether the student was admitted.
Your task is to build a binary classification model that estimates college admission chances based on a student's scores on two exams. In your training data,
a. The first column of your x array represents all Test 1 scores, and the second column represents all Test 2 scores.
b. The y vector uses '1' to label a student who was admitted and '0' to label a student who was not admitted.
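Plot the data
Load the data for the training examples into your program and add the $x_0 = 1$ intercept term to your x matrix as a new first column, so that the two exam scores end up in columns 2 and 3.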
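The loading step can be sketched as follows. The file names `ex4x.dat` and `ex4y.dat` are assumptions about what the zip file contains, so they are left commented out, and a few made-up rows stand in for the real data so that the snippet runs on its own.

```octave
% Sketch only: the file names below are assumptions about the zip contents.
% scores = load('ex4x.dat');   % m-by-2 matrix: Test 1 and Test 2 scores
% y      = load('ex4y.dat');   % m-by-1 vector of 0/1 labels

% Made-up stand-in values so the snippet runs without the data files.
scores = [55 70; 30 45; 62 81; 20 66];
y      = [1; 0; 1; 0];

m = length(y);               % number of training examples
x = [ones(m, 1), scores];    % prepend the intercept column x_0 = 1
```

After this step the exam scores sit in columns 2 and 3 of x, which is the layout the plotting commands in this exercise assume.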
Before beginning Newton's Method, we will first plot the data using different symbols to represent the two classes. In Matlab/Octave, you can separate the positive class and the negative class using the find command:
% find returns the indices of the
% rows meeting the specified condition
pos = find(y == 1); neg = find(y == 0);
% Assume the features are in the 2nd and 3rd
% columns of x
plot(x(pos, 2), x(pos, 3), '+'); hold on
plot(x(neg, 2), x(neg, 3), 'o')
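Your plot should look like the following:
[Figure: scatter plot of the training data, with '+' marking admitted students and 'o' marking students who were not admitted]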
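Newton's Method
Recall that in logistic regression, the hypothesis function is
$$h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}} = P(y = 1 \mid x; \theta)$$
In our example, the hypothesis is interpreted as the probability that a student will be admitted, given the values of the features in x.
Matlab/Octave does not have a library function for the sigmoid, so you will have to define it yourself. The easiest way to do this is through an inline expression:
g = inline('1.0 ./ (1.0 + exp(-z))');
% Usage: To find the value of the sigmoid
% evaluated at 2, call g(2)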
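As an alternative sketch, the same sigmoid can be written as an anonymous function, a form that recent MATLAB releases recommend over `inline`. This is a substitute for the exercise's inline definition, not part of the original page; it behaves the same way when called.

```octave
% Sigmoid as an anonymous function. The ./ operator makes it work
% elementwise on scalars, vectors, and matrices alike.
g = @(z) 1.0 ./ (1.0 + exp(-z));

g(0)           % 0.5, the midpoint of the sigmoid
g([-2 0 2])    % evaluated elementwise on a row vector
```

Because the function is elementwise, `g(x * theta)` returns one predicted probability per training example.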
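The cost function $J(\theta)$ is defined as
$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left[ -y^{(i)} \log h_\theta(x^{(i)}) - (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]$$
where $m$ is the number of training examples. Our goal is to use Newton's method to minimize this function. Recall that the update rule for Newton's method is
$$\theta^{(t+1)} = \theta^{(t)} - H^{-1} \nabla_\theta J$$
In logistic regression, the gradient and the Hessian are
$$\nabla_\theta J = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}$$
$$H = \frac{1}{m} \sum_{i=1}^{m} h_\theta(x^{(i)}) \left( 1 - h_\theta(x^{(i)}) \right) x^{(i)} (x^{(i)})^T$$
Note that the formulas presented above are the vectorized versions. Specifically, this means that $x^{(i)} \in \mathbb{R}^{n+1}$ and $x^{(i)} (x^{(i)})^T \in \mathbb{R}^{(n+1) \times (n+1)}$, while $h_\theta(x^{(i)})$ and $y^{(i)}$ are scalars.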
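The cost, its gradient, and its Hessian can each be evaluated in a single matrix expression. The sketch below assumes the standard logistic regression forms: the cost averages -y*log(h) - (1-y)*log(1-h), the gradient averages (h - y)*x, and the Hessian averages h*(1-h)*x*x'. The toy data are made up for illustration.

```octave
g = @(z) 1 ./ (1 + exp(-z));

% Made-up toy data: intercept column plus two features.
x = [1 2 1; 1 1 3; 1 4 2; 1 3 5];
y = [0; 0; 1; 1];
m = length(y);
theta = zeros(3, 1);

h = g(x * theta);                                       % m-by-1 predictions
J = (1/m) * sum(-y .* log(h) - (1 - y) .* log(1 - h));  % scalar cost
grad = (1/m) * x' * (h - y);                            % (n+1)-by-1 gradient
H = (1/m) * x' * diag(h .* (1 - h)) * x;                % (n+1)-by-(n+1) Hessian
```

With theta at zero every prediction is 0.5, so J equals log(2) regardless of the data, which makes a convenient sanity check for an implementation.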
Implementation
Now, implement Newton's Method in your program, starting with the initial value of $\theta = \vec{0}$. To determine how many iterations to use, calculate $J(\theta)$ for each iteration and plot your results as you did in Exercise 2. As mentioned in the lecture videos, Newton's method often converges in 5-15 iterations. If you find yourself using far more iterations, you should check for errors in your implementation.
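One possible shape for the loop is sketched below. It is not the course's reference solution: the data are made up (and deliberately not linearly separable, so that a finite minimizer exists), `max_iter` is a hypothetical name, and the Newton step uses the backslash operator rather than an explicit inverse.

```octave
g = @(z) 1 ./ (1 + exp(-z));

% Made-up, non-separable toy data standing in for the exam scores.
scores = [1 1; 2 1; 3 2; 1 3; 3 3; 2 2; 2 3; 3 4; 4 2; 4 4];
y      = [0; 0; 0; 0; 0; 1; 1; 1; 1; 1];
m = length(y);
x = [ones(m, 1), scores];

max_iter = 15;                       % hypothetical iteration budget
theta = zeros(size(x, 2), 1);        % start from theta = 0
J = zeros(max_iter, 1);              % cost recorded before each update

for t = 1:max_iter
    h = g(x * theta);
    J(t) = (1/m) * sum(-y .* log(h) - (1 - y) .* log(1 - h));
    grad = (1/m) * x' * (h - y);
    H = (1/m) * x' * diag(h .* (1 - h)) * x;
    theta = theta - H \ grad;        % Newton step: solve H * d = grad
end

% Uncomment to inspect convergence:
% plot(0:max_iter-1, J, 'o-'); xlabel('Iteration'); ylabel('J');
```

Solving the linear system with `H \ grad` gives the same step as multiplying by the inverse Hessian, and is usually the more numerically stable choice.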
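After convergence, use your values of theta to find the decision boundary in the classification problem. The decision boundary is defined as the line where
$$P(y = 1 \mid x; \theta) = g(\theta^T x) = 0.5$$
which corresponds to
$$\theta^T x = 0$$
Plotting the decision boundary is equivalent to plotting the $\theta^T x = 0$ line. When you are finished, your plot should appear like the figure below.
[Figure: the training data with the linear decision boundary drawn through it]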
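Since the boundary is a straight line, two points are enough to draw it. The sketch below solves theta(1) + theta(2)*x1 + theta(3)*x2 = 0 for x2, the Test 2 score. The theta values are hypothetical placeholders, not the solution to this exercise, and the approach assumes theta(3) is nonzero.

```octave
theta = [-3; 0.02; 0.04];     % hypothetical values, for illustration only

x1 = [10; 70];                % two Test 1 scores spanning the plot range
x2 = -(theta(1) + theta(2) .* x1) ./ theta(3);   % matching Test 2 scores

% Uncomment to draw the line on top of the existing scatter plot:
% hold on; plot(x1, x2);
```

Every point (x1, x2) produced this way satisfies the boundary condition, so the hypothesis evaluates to 0.5 along the whole line.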
Questions
Finally, record your answers to these questions.
1. What values of $\theta$ did you get? How many iterations were required for convergence?
2. What is the probability that a student with a score of 20 on Exam 1 and a score of 80 on Exam 2 will not be admitted?
Solutions
After you have completed the exercises above, please refer to the solutions below and check that your implementation and your answers are correct. In a case where your implementation does not result in the same parameters/phenomena as described below, debug your solution until you manage to replicate the same effect as our implementation.
A complete m-file implementation of the solutions can be found here.
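Newton's Method
1. Your final values of theta should be
[values not preserved in this copy]
Plot. Your plot of the cost function should look similar to the picture below:
[Figure: cost function J plotted against the iteration number]
From this plot, you can infer that Newton's Method has converged by around 5 iterations. In fact, by looking at a printout of the values of J, you will see that J changes by less than [value not preserved in this copy] between the 4th and 5th iterations. Recall that in the previous two exercises, gradient descent took hundreds or even thousands of iterations to converge. Newton's Method is much faster in comparison.
2. The probability that a student with a score of 20 on Exam 1 and 80 on Exam 2 will not be admitted to college is 0.668.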
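The second answer comes from evaluating the hypothesis at the new student's scores and taking the complement, since the hypothesis gives the probability of admission. The sketch below shows the computation with hypothetical theta values, so its result will not match 0.668. Substitute the theta your own implementation converged to.

```octave
g = @(z) 1 ./ (1 + exp(-z));

theta = [-3; 0.02; 0.04];     % hypothetical values, NOT the exercise solution
x_new = [1, 20, 80];          % intercept, Exam 1 score, Exam 2 score

p_admit = g(x_new * theta);   % P(y = 1 | x; theta)
p_not   = 1 - p_admit;        % P(y = 0 | x; theta)
```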