<TeXmacs|2.1.4>
<style|<tuple|generic|old-lengths>>
<\body>
<doc-data|<doc-title|Numerical computations>>
<em|References>:
0. H. Abelson, G. J. Sussman and J. Sussman, <em|Structure and
Interpretation of Computer Programs>.
1. Lin Chengsen (\<#6797\>\<#6210\>\<#68EE\>), <em|Numerical Computation
Methods> (<em|\<#6570\>\<#503C\>\<#8BA1\>\<#7B97\>\<#65B9\>\<#6CD5\>>).
2. R. Burden and J. Faires, <em|Numerical Analysis>.
3. Fan Jinyan (\<#8303\>\<#91D1\>\<#71D5\>) and Yuan Yaxiang
(\<#8881\>\<#4E9A\>\<#6E58\>), <em|Numerical Methods for Nonlinear
Systems of Equations> (<em|\<#975E\>\<#7EBF\>\<#6027\>\<#65B9\>\<#7A0B\>\<#7EC4\>\<#6570\>\<#503C\>\<#89E3\>\<#6CD5\>>).
4. Gao Li (\<#9AD8\>\<#7ACB\>), <em|Numerical Optimization Methods>
(<em|\<#6570\>\<#503C\>\<#6700\>\<#4F18\>\<#5316\>\<#65B9\>\<#6CD5\>>).
5. J. Nocedal and S. Wright, <em|Numerical Optimization>.
6. Y. Yuan and J. Stoer, <em|A subspace study on conjugate gradient
algorithms>, Z. Angew. Math. Mech. 75 (1995) 69-77.
<\table-of-contents|toc>
<vspace*|1fn><with|font-series|bold|math-font-series|bold|1<space|2spc>Introduction>
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-1><vspace|0.5fn>
<vspace*|1fn><with|font-series|bold|math-font-series|bold|2<space|2spc>Optimization
problems> <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-2><vspace|0.5fn>
<with|par-left|1tab|2.1<space|2spc>Unconstrained optimization problem
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-3>>
<with|par-left|1tab|2.2<space|2spc>Line search approach
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-4>>
<with|par-left|2tab|2.2.1<space|2spc>Line search algorithm
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-5>>
<with|par-left|2tab|2.2.2<space|2spc>Different methods for the descent
direction <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-7>>
<with|par-left|1tab|2.3<space|2spc>Trust region approach
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-9>>
<with|par-left|2tab|2.3.1<space|2spc>Newton, Quasi-Newton, LMF and CG
methods <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-10>>
<with|par-left|2tab|2.3.2<space|2spc>Subspace method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-11>>
<with|par-left|1tab|2.4<space|2spc>Strict-inequality constrained
optimization problem <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-13>>
<with|par-left|1tab|2.5<space|2spc>Constrained optimization problem
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-14>>
<with|par-left|2tab|2.5.1<space|2spc>Penalty function method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-15>>
<with|par-left|2tab|2.5.2<space|2spc>Augmented Lagrange penalty function
method <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-16>>
<vspace*|1fn><with|font-series|bold|math-font-series|bold|3<space|2spc>Algebraic
equations> <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-17><vspace|0.5fn>
<with|par-left|1tab|3.1<space|2spc>Single variable eqn
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-18>>
<with|par-left|2tab|3.1.1<space|2spc>Bisection method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-19>>
<with|par-left|2tab|3.1.2<space|2spc>Iteration-function method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-20>>
<with|par-left|2tab|3.1.3<space|2spc>Mixed method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-21>>
<with|par-left|2tab|3.1.4<space|2spc>Accelerating iteration: Aitken
method and its super version <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-22>>
<with|par-left|1tab|3.2<space|2spc>Linear eqns
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-23>>
<with|par-left|2tab|3.2.1<space|2spc>Direct methods
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-24>>
<with|par-left|2tab|3.2.2<space|2spc>Iterative methods
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-25>>
<with|par-left|2tab|3.2.3<space|2spc>Jacobi method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-26>>
<with|par-left|2tab|3.2.4<space|2spc>Gauss-Seidel method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-27>>
<with|par-left|2tab|3.2.5<space|2spc>Successive-Over-Relaxation(SOR)
method <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-28>>
<with|par-left|2tab|3.2.6<space|2spc>Aitken method and its super version
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-29>>
<with|par-left|2tab|3.2.7<space|2spc>The convergence problem
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-30>>
<with|par-left|1tab|3.3<space|2spc>Non-linear eqns
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-31>>
<with|par-left|2tab|3.3.1<space|2spc>Newton method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-32>>
<with|par-left|2tab|3.3.2<space|2spc>Newton-Secant method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-33>>
<with|par-left|2tab|3.3.3<space|2spc>Quasi-Newton method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-34>>
<with|par-left|2tab|3.3.4<space|2spc>Levenberg-Marquardt method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-35>>
<vspace*|1fn><with|font-series|bold|math-font-series|bold|4<space|2spc>Ordinary
differential eqns: initial-value problems>
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-36><vspace|0.5fn>
<with|par-left|1tab|4.1<space|2spc>One-step methods: Runge-Kutta method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-37>>
<with|par-left|1tab|4.2<space|2spc>Multi-steps methods: Adams methods
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-38>>
<vspace*|1fn><with|font-series|bold|math-font-series|bold|5<space|2spc>Monte
Carlo methods> <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-39><vspace|0.5fn>
</table-of-contents>
<section|Introduction>
When studying scientific problems, we are often faced with practical
tasks such as <em|finding the minimum of a function>, <em|solving
(algebraic or differential) equations> and <em|doing integrations>,
which belong to the subject of <em|applied mathematics>. These problems
are often hard to solve analytically, so numerical methods are needed.
This note gives an introduction to solving these problems, and the
algorithms are described in the language <verbatim|Scheme> (1975), a
Lisp (1958) dialect, which is quite appropriate for describing and
addressing these problems.
<section|Optimization problems>
<subsection|Unconstrained optimization problem>
<strong|Problem:> A nonlinear unconstrained optimization problem is to
find <math|x> attaining
<\equation>
min<around*|{|f<around*|(|x|)><around*|\||x\<in\>\<bbb-R\><rsup|n>|\<nobracket\>>|}><label|optimization-problem>.
</equation>
The general method to solve the problem is to find an iterative sequence
<math|<around*|{|x<rsub|0>,x<rsub|1>,\<cdots\>,x<rsub|n>,\<cdots\>|}>>
whose limit is the solution: <math|x=lim<rsub|n\<rightarrow\>\<infty\>>x<rsub|n>>,
which means <math|x<rsub|n>> approximates the solution when <math|n> is
large enough. The iteration relation often takes the form
<\equation>
x<rsub|k+1>=x<rsub|k>+s<rsub|k>d<rsub|k><label|iteration-relation-of-optimization>,
</equation>
where the <strong|direction> <math|d<rsub|k>> satisfies the <em|descent
condition>,
<\equation>
d<rsub|k>\<cdot\>\<nabla\>f<around*|(|x<rsub|k>|)>\<less\>0<label|descent-condtion>,
</equation>
and the <strong|step size> <math|s<rsub|k>> is <em|positive>:
<math|s<rsub|k>\<gtr\>0>.
There are two main strategies to choose <math|s<rsub|k>> and
<math|d<rsub|k>>: the <strong|line search strategy> and the <strong|trust
region strategy>.
The line search strategy first finds a descent direction
<math|d<rsub|k>>, then tries to find an appropriate step
<math|s<rsub|k>>; the trust region strategy instead fixes a closed
region <math|\<Delta\><rsub|k>> and finds the best (or at least
acceptable) step <math|s<rsub|k>> and direction <math|d<rsub|k>> in this
region: <math|x<rsub|k>+s<rsub|k>d<rsub|k>\<in\>\<Delta\><rsub|k>>.
We first give the line search strategy, then consider the trust region
strategy.
<subsection|Line search approach>
<subsubsection|Line search algorithm>
Since the descent direction varies from method to method but the
line-search framework is almost the same, we first consider the step
size <math|s<rsub|k>>, given a descent direction <math|d<rsub|k>>.
Although, given a direction <math|d<rsub|k>>, solving for
<math|s<rsub|k>> in
<\equation*>
min<rsub|s\<gtr\>0> f<around*|(|x<rsub|k>+s d<rsub|k>|)>
</equation*>
is a one-dimensional problem, it is hard to solve for <math|s<rsub|k>>
exactly, which means we need to introduce some conditions saying when an
<math|s<rsub|k>> is good enough, so that the calculation in this
direction can stop. We give two commonly used conditions: the <em|Armijo
condition> and the <em|curvature condition>:
<\definition>
[<strong|Armijo condition> and <strong|curvature condition>]
<\equation>
f<around*|(|x<rsub|k>+s d<rsub|k>|)>\<leqslant\>f<around*|(|x<rsub|k>|)>+c<rsub|1>s\<nabla\>f<rsub|k>\<cdot\>d<rsub|k><label|armijo-condition>,
</equation>
<\equation>
<around*|\||\<nabla\>f<around*|(|x<rsub|k>+s
d<rsub|k>|)>\<cdot\>d<rsub|k>|\|>\<leqslant\>c<rsub|2><around*|\||\<nabla\>f<rsub|k>\<cdot\>d<rsub|k>|\|><label|curvature-condition>,
</equation>
where <math|c<rsub|1>,c<rsub|2>> satisfy
<\equation*>
0\<less\>c<rsub|1>\<less\>c<rsub|2>\<less\>1.
</equation*>
Note that the <strong|strong Wolfe condition> is just the combination
of the above two conditions.
Note: the Armijo condition guarantees (provided <math|f> is bounded
below) that there exists a finite <math|s<rsub|\<ast\>>> that does not
satisfy it, which means we can focus on the region
<math|<around*|[|0,s<rsub|\<ast\>>|]>>.
</definition>
The line-search algorithm is also an iterative method, i.e., it
generates an increasing sequence {<math|s<rsub|0>=0,s<rsub|1>,\<cdots\>,s<rsub|i>,s<rsub|i+1>,\<cdots\>>},
in which some element <math|s<rsub|\<ast\>>> eventually satisfies the
Armijo and curvature conditions. We first reduce the problem to a
one-dimensional one, i.e., define
<\equation*>
y<around*|(|s|)>\<assign\>f<around*|(|x<rsub|k>+s d<rsub|k>|)>,
</equation*>
and
<\equation*>
y<rprime|'><around*|(|s|)>=\<nabla\>f<around*|(|x<rsub|k>+s
d<rsub|k>|)>\<cdot\>d<rsub|k>.
</equation*>
Thus the conditions are translated into the following conditions
<\eqnarray>
<tformat|<table|<row|<cell|y<around*|(|s|)>>|<cell|\<leqslant\>>|<cell|y<around*|(|0|)>+c<rsub|1>s
y<rprime|'><around*|(|0|)>,>>|<row|<cell|<around*|\||y<rprime|'><around*|(|s|)>|\|>>|<cell|\<leqslant\>>|<cell|c<rsub|2><around*|\||y<rprime|'><around*|(|0|)>|\|>,>>>>
</eqnarray>
with <math|0\<less\>c<rsub|1>\<less\>c<rsub|2>\<less\>1>.
Since we hope <math|s<rsub|i+1>> is better than <math|s<rsub|i>>, we also
require
<\equation*>
y<around*|(|s<rsub|i+1>|)>\<leqslant\>y<around*|(|s<rsub|i>|)>.
</equation*>
Thus we get the final conditions on the iterative sequence
<math|<around*|{|s<rsub|0>=0,s<rsub|1>,\<cdots\>|}>>:
(a) Sufficient-Descent-Condition (sdc):
<\equation>
<choice|<tformat|<table|<row|<cell|y<around*|(|s|)>\<leqslant\>y<around*|(|0|)>+c<rsub|1>s
y<rprime|'><around*|(|0|)>,>>|<row|<cell|y<around*|(|s|)>\<leqslant\>y<around*|(|s<rsub|i>|)>,>>>>>i=1,2,\<cdots\><label|sdc>.
</equation>
(b) Curvature-Condition (cc):
<\equation>
<around*|\||y<rprime|'><around*|(|s|)>|\|>\<leqslant\>c<rsub|2><around*|\||y<rprime|'><around*|(|0|)>|\|><label|cc>.
</equation>
The picture below shows the meaning of sdc and cc.
<\big-figure|<with|gr-mode|<tuple|edit|math-at>|gr-frame|<tuple|scale|1cm|<tuple|0.5gw|0.5gh>>|gr-geometry|<tuple|geometry|1par|0.6par>|gr-grid|<tuple|empty>|gr-grid-old|<tuple|cartesian|<point|0|0>|1>|gr-edit-grid-aspect|<tuple|<tuple|axes|none>|<tuple|1|none>|<tuple|10|none>>|gr-edit-grid|<tuple|empty>|gr-edit-grid-old|<tuple|cartesian|<point|0|0>|1>|gr-arrow-end|\<gtr\>|gr-auto-crop|true|<graphics||<with|arrow-end|\<gtr\>|<line|<point|-3|2>|<point|6.0|2.0>>>|<with|dash-style|10|line-width|2ln|<spline|<point|-3|2>|<point|6.0|-1.3>>>|<spline|<point|-3|2>|<point|-2.2|0.0>|<point|-1.5|-1.0>|<point|-0.6|-0.5>|<point|1.0|1.4>|<point|2.0|-1.0>|<point|3.0|-2.3>|<point|4.4|-1.0>|<point|5.7|1.3>>|<with|fill-color|dark
grey|dash-style|10|line-width|2ln|<cline|<point|-1.77446|-0.840628>|<point|-0.826795863875585|-0.756010862203049>|<point|-1.05192482762799|-0.937151546190836>|<point|-1.27038935740619|-1.00420979134943>|<point|-1.5|-1.0>>>|<with|fill-color|dark
grey|dash-style|10|line-width|2ln|<cline|<point|2.70399|-2.16489>|<point|3.51634287484896|-2.13733868090028>|<point|3.356779619797|-2.236240796522>|<point|3.00000000000001|-2.29999999999999>>>|<line|<point|-2.6|0.972956>|<point|6.0|1.0>>|<with|opacity|40%|fill-color|dark
cyan|<cline|<point|-2.59461|0.95939>|<point|-0.219387804298274|0.980442194909367>|<point|0.270373419190661|0.80086307963009>|<point|-0.17548847478443|0.0907309316276841>|<point|-0.818612261388542|-0.7478156579638>|<point|-1.26780196252372|-1.003824454608>|<point|-1.77224181188479|-0.843013503415153>|<point|-2.2|-8.88178419700125e-16>>>|<with|opacity|40%|fill-color|dark
cyan|<cline|<point|1.59367|0.315653>|<point|1.99999999999999|-1.00000000000001>|<point|2.28004813924113|-1.61681547554277>|<point|2.66284400190881|-2.13093592030098>|<point|3.2152333499942|-2.28876949486246>|<point|3.68227225593638|-1.98947298064476>|<point|4.06233471347244|-1.51142471701807>|<point|4.54577105779716|-0.766782721192294>>>|<math-at|s<rsub|i>|<point|-3.0|0.936126863544746>>|<math-at|s<rsub|i+1>|<point|6.09705318163778|1.6335659478767>>|<with|arrow-end|\<gtr\>|<line|<point|-3|-2>|<point|-3.0|3.0>>>|<math-at|s<rsub|0>=0|<point|-3.91485315517926|1.90873462098161>>|<math-at|sdc|<point|-1.43341|0.0482372>>|<math-at|sdc|<point|2.82112|-0.988937>>|<math-at|cc|<point|0.662108083079772|-1.36993980685276>>|<with|arrow-end|\<gtr\>|<line|<point|0.544549543590422|-1.10054239978833>|<point|-1.08835679887935|-0.955896852759133>>>|<with|arrow-end|\<gtr\>|<line|<point|1.36061317634608|-1.32760616483662>|<point|3.00000000000001|-2.29999999999999>>>|<math-at|loop|<point|-2.76691692022754|-0.201779335890991>>|<math-at|loop|<point|1.46645|-0.942618>>|<math-at|zoom|<point|3.92030002266349|-1.70602211789526>>|<math-at|zoom|<point|-0.480900251355999|-0.688616219076597>>|<math-at|zoom|<point|0.513940336023284|1.44923270273846>>|<math-at|y<around*|(|s|)>|<point|-3.77073686995634|2.76306389734092>>>>>
Region for <math|s<rsub|i+1>>: sdc (light green) and cc (deep green).
</big-figure>
<\scm-code>
;;Sufficient-Descent-Condition
;;ys = y(s), ys0 = y(s_i), y0 = y(0), dy0 = y'(0)
(define (sdc c1 s)
\ \ (lambda (ys ys0 y0 dy0)
\ \ \ \ (and (\<less\>= ys (+ y0 (* c1 s dy0)))
\ \ \ \ \ \ \ \ \ (\<less\>= ys ys0))))
;;Curvature-Condition
(define (cc c2)
\ \ (lambda (dys dy0)
\ \ \ \ (\<less\>= (abs dys) (* c2 (abs dy0)))))
</scm-code>
Now we can give the line-search algorithm, as follows.
<\equation*>
\ <around*|(|loop s<rsub|0>=0 \ s<rsub|1>|)>:<around*|(|sdc?
s<rsub|1>|)>\<rightarrow\><block|<tformat|<table|<row|<cell|yes:<around*|(|cc?
s<rsub|1>|)>\<rightarrow\><block|<tformat|<table|<row|<cell|yes\<Rightarrow\>s<rsub|1>>>|<row|<cell|no:<around*|(|y<rprime|'><around*|(|s<rsub|1>|)>\<gtr\>0|)>\<rightarrow\><block|<tformat|<table|<row|<cell|yes\<Rightarrow\><around*|(|zoom
s<rsub|1> s<rsub|0>|)>>>|<row|<cell|no\<Rightarrow\><around*|(|loop
s<rsub|1> new<rsub|->s|)>>>>>>>>>>>>>|<row|<cell|no\<Rightarrow\><around*|(|zoom
s<rsub|0> s<rsub|1>|)>>>>>>
</equation*>
<\scm-code>
;;---- Line-Search-Algorithm
(define (line-search-algorithm dy c1 c2)
\ \ ;; require 0\<less\>c1\<less\>c2\<less\>1
\ \ (let loop ([s0 0] [s1 (choose-one-step 0)])
\ \ \ \ \ \ \ (if (~sdc~ s1)
\ \ \ \ \ \ \ \ \ \ \ (if (~cc~ s1) s1 ;return
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (if (positive? (dy s1))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (zoom s1 s0)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (loop s1 (choose-one-step s1))))
\ \ \ \ \ \ \ \ \ \ \ (zoom s0 s1))))
;1. ~sdc~ and ~cc~ are the predicates obtained from the functions sdc,
cc above with fixed parameters;
;2. (choose-one-step s) returns a new step larger than s;
;3. zoom is defined in the next algorithm;
;;note: choosing the new step is delicate: it should not be too large,
but too small would be expensive!
</scm-code>
In the above algorithm, the <verbatim|zoom> function on
<math|<around*|(|s<rsub|c>,s<rsub|b>|)>> helps to generate the
<math|s\<equiv\>s<rsub|z>> satisfying the strong Wolfe condition between
<math|s<rsub|c>> and <math|s<rsub|b>>, under the conditions:
<\enumerate-alpha>
<item>There exists an <math|s> between <math|s<rsub|c>> and
<math|s<rsub|b>> satisfying the strong Wolfe condition;
<item><math|s<rsub|c>> satisfies sdc but not cc;
<item><math|s<rsub|c>> and <math|s<rsub|b>> satisfy <math|y<rprime|'><around*|(|s<rsub|c>|)><around*|(|s<rsub|b>-s<rsub|c>|)>\<less\>0>.
</enumerate-alpha>
<\note>
Please check that in the previous line-search algorithm, the calls to
<verbatim|zoom> satisfy the above three conditions. [ a) should be
checked carefully! ]
</note>
<\scm-code>
;;---- zoom
(define (zoom sc sb)
\ \ (let ([s (try-one-step sc sb)]);try-one-step
\ \ \ \ (if (~sdc~ s)
\ \ \ \ \ \ \ \ (if (~cc~ s) s ;return
\ \ \ \ \ \ \ \ \ \ \ \ (zoom s (if (negative? (* (dy s) (- sb sc)))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ sb
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ sc)))
\ \ \ \ \ \ \ \ (zoom sc s))))
;try-one-step returns a step between sc and sb;
;one can use (/ (+ sc sb) 2) or a quadratic interpolation given by yc,
yb, dyc.
</scm-code>
<\note>
Please check that for all the calls to <verbatim|zoom>, the conditions
on <math|s<rsub|c>,s<rsub|b>> are always satisfied.
[ a) should be checked carefully! ]
</note>
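As a concrete illustration, here is a minimal, self-contained wiring of
the pieces above (a sketch: the toy function
<math|y<around*|(|s|)>=<around*|(|s-2|)><rsup|2>>, the constants and the
bisection used in place of <verbatim|try-one-step> are our own
assumptions, not fixed by the text above).
<\scm-code>
;; Minimal instance of the line search on y(s) = (s - 2)^2,
;; so y'(s) = 2(s - 2) and y'(0) \<less\> 0, as required.
(define (y s) (* (- s 2) (- s 2)))
(define (dy s) (* 2 (- s 2)))
(define c1 1e-4) ;0 \<less\> c1 \<less\> c2 \<less\> 1
(define c2 0.1)
(define (sdc? s si) ;sufficient descent w.r.t. previous step si
\ \ (and (\<less\>= (y s) (+ (y 0) (* c1 s (dy 0))))
\ \ \ \ \ \ \ (\<less\>= (y s) (y si))))
(define (cc? s)
\ \ (\<less\>= (abs (dy s)) (* c2 (abs (dy 0)))))
(define (zoom sc sb)
\ \ (let ([s (/ (+ sc sb) 2)]) ;bisection as try-one-step
\ \ \ \ (if (sdc? s sc)
\ \ \ \ \ \ \ \ (if (cc? s) s
\ \ \ \ \ \ \ \ \ \ \ \ (zoom s (if (negative? (* (dy s) (- sb sc))) sb sc)))
\ \ \ \ \ \ \ \ (zoom sc s))))
(define (line-search)
\ \ (let loop ([s0 0] [s1 1])
\ \ \ \ (if (sdc? s1 s0)
\ \ \ \ \ \ \ \ (if (cc? s1) s1
\ \ \ \ \ \ \ \ \ \ \ \ (if (positive? (dy s1))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (zoom s1 s0)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (loop s1 (* 2 s1)))) ;double the step
\ \ \ \ \ \ \ \ (zoom s0 s1))))
;(line-search) =\<gtr\> 2, which satisfies both conditions here
</scm-code>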
<subsubsection|Different methods for the descent direction>
(1) The simplest method is the Steepest-Descent method (<strong|SD>),
whose descent direction at <math|x<rsub|k>> is the negative gradient:
<\equation>
d<rsub|k><rsup|SD>=-grad<around*|(|x<rsub|k>|)>\<equiv\>-g<rsub|k><label|SD>.
</equation>
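As a sketch of how this direction is used (the vector helpers, the
fixed step size and the example gradient are illustration-only
assumptions; in practice <math|s<rsub|k>> would come from the line
search above):
<\scm-code>
;; Steepest descent with a fixed step size, a minimal sketch
;; (vectors are plain Scheme lists).
(define (v-sub u v) (map - u v))
(define (v-scale a v) (map (lambda (x) (* a x)) v))
(define (norm v) (sqrt (apply + (map * v v))))
(define (sd grad x s eps)
\ \ (let loop ([x x])
\ \ \ \ (let ([g (grad x)])
\ \ \ \ \ \ (if (\<less\> (norm g) eps)
\ \ \ \ \ \ \ \ \ \ x
\ \ \ \ \ \ \ \ \ \ (loop (v-sub x (v-scale s g)))))))
;; example: f(x,y) = x^2 + 2y^2, grad f = (2x, 4y)
;; (sd (lambda (p) (list (* 2 (car p)) (* 4 (cadr p)))) '(1 1) 0.1 1e-6)
</scm-code>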
(2) The second method is the Newton method (<strong|Newton>), which is
derived as follows.
Consider when <math|x<rsub|k>> is near the solution <math|x<rsub|\<ast\>>>,
then by Taylor expansion at <math|x<rsub|k>>,
<\equation*>
f<around*|(|x|)>=f<around*|(|x<rsub|k>+\<delta\>|)>=f<rsub|k>+g<rsub|k>\<cdot\>\<delta\>+<frac|1|2>\<delta\><rsup|T>G\<delta\>+o<around*|(|<around*|\<\|\|\>|\<delta\>|\<\|\|\>><rsup|2>|)>,
</equation*>
then the minimum of <math|f<rsub|k>+g<rsub|k>\<cdot\>\<delta\>+<frac|1|2>\<delta\><rsup|T>G\<delta\>>
is attained at <math|G\<delta\>=-g<rsub|k>>, which means we can use
<\equation>
d<rsub|k><rsup|Newton>=-G<rsup|-1>g<rsub|k><label|Newton>,
</equation>
and <math|s<rsub|k>=1>.
<\note>
Newton's method may fail when <math|G> is not invertible.
</note>
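For illustration, a minimal 2-dimensional sketch of the Newton
direction, solving <math|G d=-g<rsub|k>> by Cramer's rule (the
list-based matrix representation is an assumption for the example):
<\scm-code>
;; Newton direction in 2-D: solve G d = -g by Cramer's rule
;; (a sketch; assumes G is invertible, see the note above).
(define (newton-direction-2d G g) ;G = ((a b) (c d)), g = (g1 g2)
\ \ (let* ([a (caar G)] [b (cadar G)]
\ \ \ \ \ \ \ \ \ [c (caadr G)] [d (cadadr G)]
\ \ \ \ \ \ \ \ \ [det (- (* a d) (* b c))]
\ \ \ \ \ \ \ \ \ [g1 (car g)] [g2 (cadr g)])
\ \ \ \ (list (/ (- (* b g2) (* d g1)) det)
\ \ \ \ \ \ \ \ \ \ (/ (- (* c g1) (* a g2)) det))))
;; e.g. (newton-direction-2d '((2 0) (0 4)) '(2 4)) =\<gtr\> (-1 -1)
</scm-code>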
(3) Since the Hessian matrix <math|G=\<partial\><rsub|x<rsup|i>x<rsup|j>>f<around*|(|x<rsub|k>|)>>
is hard to get or is space-expensive, one wants some approximation to
<math|G> or <math|G<rsup|-1>>. One class of such methods is called the
Quasi-Newton method (<strong|Quasi-Newton>).
xxxxx to be completed xxxxx
\;
(4) A newer class of methods is called the Subspace method
(<strong|Subspace>), including the conjugate-gradient methods
(<strong|CG>), whose idea is to reduce the minimization problem to a
lower-dimensional problem.
<\note>
The subspace method may be better than the Newton or secant-Newton
method, because the reduced Hessian matrix <math|<wide|G|^>> can be
solved by hand, and sometimes the reduced Hessian matrix is invertible
even when the full matrix <math|G> is not.
</note>
Below we give some methods, inspired by the Newton method, that only
need the information of two points, the previous and the current one:
\ <math|<around*|(|x<rsub|0>,f<rsub|0>,g<rsub|0>\<equiv\>grad<rsub|0>|)>,<around*|(|x<rsub|1>,f<rsub|1>,g<rsub|1>\<equiv\>grad<rsub|1>|)>>,
and give <math|x<rsub|2>=x<rsub|1>+d<rsub|1>>.
<\example>
[this method needs a new point <math|<wide|x|~>>]
Suppose that we have got <math|x<rsub|0>> and
<math|x<rsub|1>=x<rsub|0>+d<rsub|0>>, try to get
<math|x<rsub|2>=x<rsub|1>+d<rsub|1>>, where
<math|d<rsub|1>\<in\>span<around*|{|-g<rsub|1>,d<rsub|0>|}>>, i.e., let
<math|<wide|g|^><rsub|1>=g<rsub|1>/<around*|\||g<rsub|1>|\|>> and
<math|<wide|d|^><rsub|0>=d<rsub|0>/<around*|\||d<rsub|0>|\|>>, and
consider
<\equation*>
x=x<rsub|1>-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>,
</equation*>
and try to find the best <math|\<mu\>,\<nu\>>, which is shown in the
following lemma.
<\lemma*>
(1) Let <math|G> be the Hessian matrix of <math|f<around*|(|x|)>> at
<math|x<rsub|1>>, and assume <math|G> is
<strong|positive-definite>; then, to second order, the
minimum of <math|m<around*|(|s,\<beta\>|)>=f<around*|(|x<rsub|1>+s<around*|(|-<wide|g|^><rsub|1>+\<beta\><wide|d|^><rsub|0>|)>|)>>
is attained at
<\eqnarray>
<tformat|<table|<row|<cell|\<beta\>>|<cell|=>|<cell|<frac|<around*|\<langle\>|<wide|d|^><rsub|0>,<wide|g|^><rsub|1>|\<rangle\>>-<wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1><around*|\<langle\>|<wide|g|^><rsub|1>|\<rangle\>><rsup|2>|<around*|\<langle\>|<wide|d|^><rsub|0>|\<rangle\>><rsup|2>-<wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1><around*|\<langle\>|<wide|d|^><rsub|0>,<wide|g|^><rsub|1>|\<rangle\>>>,>>|<row|<cell|s>|<cell|=>|<cell|<around*|\||g<rsub|1>|\|><frac|1-\<beta\><wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1>|<around*|\<langle\>|<wide|g|^><rsub|1>|\<rangle\>><rsup|2>-2\<beta\><around*|\<langle\>|<wide|d|^><rsub|0>,<wide|g|^><rsub|1>|\<rangle\>>+\<beta\><rsup|2><around*|\<langle\>|<wide|d|^><rsub|0>|\<rangle\>><rsup|2>>,>>>>
</eqnarray>
where
<\equation*>
<around*|\<langle\>|\<xi\>,\<eta\>|\<rangle\>>\<assign\>\<xi\><rsup|T>G\<eta\>,<around*|\<langle\>|\<xi\>|\<rangle\>><rsup|2>\<equiv\><around*|\<langle\>|\<xi\>,\<xi\>|\<rangle\>>.
</equation*>
(2) To avoid the Hessian matrix, we can approximate
<math|<around*|\<langle\>|<wide|d|^><rsub|0>|\<rangle\>><rsup|2>,<around*|\<langle\>|<wide|d|^><rsub|0>,g<rsub|1>|\<rangle\>>,<around*|\<langle\>|<wide|g|^><rsub|1>|\<rangle\>><rsup|2>>
as follows,
<\eqnarray>
<tformat|<table|<row|<cell|<around*|\<langle\>|<wide|d|^><rsub|0>|\<rangle\>><rsup|2>>|<cell|=>|<cell|<frac|<around*|\||g<rsub|1>|\|>|<around*|\||d<rsub|0>|\|>><around*|(|<wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1>-<frac|<around*|\||g<rsub|0>|\|>|<around*|\||g<rsub|1>|\|>><wide|d<rsub|0>|^>\<cdot\><wide|g|^><rsub|0>|)>,>>|<row|<cell|<around*|\<langle\>|<wide|d|^><rsub|0>,<wide|g|^><rsub|1>|\<rangle\>>>|<cell|=>|<cell|<frac|<around*|\||g<rsub|1>|\|>|<around*|\||d<rsub|0>|\|>><around*|(|1-<frac|<around*|\||g<rsub|0>|\|>|<around*|\||g<rsub|1>|\|>><wide|g|^><rsub|0>\<cdot\><wide|g|^><rsub|1>|)>,>>|<row|<cell|<around*|\<langle\>|<wide|g|^><rsub|1>|\<rangle\>><rsup|2>>|<cell|=>|<cell|<frac|<around*|\||g<rsub|1>|\|>|d<rsub|min>><around*|(|1-<frac|<around*|\||<wide|g|~>|\|>|<around*|\||g<rsub|1>|\|>><wide|<wide|g|~>|^>\<cdot\><wide|g|^><rsub|1>|)>,<wide|x|~>=x<rsub|1>+<wide|s|~><around*|(|-<wide|g|^><rsub|1>|)>,
<wide|s|~>\<assign\>d<rsub|min>.>>>>
</eqnarray>
</lemma*>
<\big-figure|<with|gr-mode|<tuple|edit|math-at>|gr-frame|<tuple|scale|1cm|<tuple|0.5gw|0.510001gh>>|gr-geometry|<tuple|geometry|1par|0.6par>|gr-grid|<tuple|empty>|gr-grid-old|<tuple|cartesian|<point|0|0>|1>|gr-edit-grid-aspect|<tuple|<tuple|axes|none>|<tuple|1|none>|<tuple|10|none>>|gr-edit-grid|<tuple|empty>|gr-edit-grid-old|<tuple|cartesian|<point|0|0>|1>|gr-line-width|2ln|gr-arrow-end|\<gtr\>|gr-auto-crop|true|<graphics||<line|<point|0.0|2.0>|<point|0.0|-2.0>>|<cspline|<point|-3|0>|<point|0.0|-1.0>|<point|3.0|0.0>|<point|0.0|1.0>>|<line|<point|-5.0|0.0>|<point|4.0|0.0>>|<with|dash-style|10|line-width|0.5ln|<line|<point|1|1.6>|<point|-4.2|0.0>>>|<point|-1.79823|0.730531>|<with|arrow-end|\<gtr\>|line-width|2ln|<line|<point|-1.79823|0.730531>|<point|-1.43833509723508|-0.536000132292631>>>|<point|-1.69155|0.355091>|<point|-0.581097|1.11351>|<math-at|x<rsub|0>|<point|-0.655163|1.41135>>|<math-at|<wide|x|~>|<point|-2.03100608546104|0.226005423997883>>|<with|arrow-end|\<gtr\>|dash-style|11100|line-width|2ln|<line|<point|-1.79823|0.730531>|<point|0.0|0.0>>>|<math-at|<with|font-series|bold|d><rsub|1>|<point|-0.747674038792533|0.303742604245924>>|<with|arrow-end|\<gtr\>|line-width|2ln|<line|<point|-0.581097|1.11351>|<point|-1.79823|0.730531>>>|<math-at|<with|font-series|bold|d><rsub|0>|<point|-1.34566|1.13319>>|<math-at|-<with|font-series|bold|g><rsub|1>|<point|-2.23327540894583|-0.583752074868228>>|<math-at|x<rsub|1>|<point|-2.23467059134806|0.859091811086122>>>>>
Subspace method in line-search algorithm [note that
<math|<frac|\<partial\>f|\<partial\>d<rsub|0>><around*|\||<rsub|x<rsub|1>>|\<nobracket\>>=<with|font-series|bold|d><rsub|0>\<cdot\><with|font-series|bold|g><rsub|1>\<sim\>0>]
</big-figure>
<\proof>
Since <math|f<around*|(|x|)>=f<around*|(|x<rsub|1>-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>|)>=f<rsub|1>+<around*|(|-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>|)><rsup|T>g<rsub|1>+<frac|1|2><around*|(|-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>|)><rsup|T>G<around*|(|-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>|)>+O<around*|(|\<mu\><rsup|3>,\<nu\><rsup|3>|)>>,
we get an approximate function <math|m<around*|(|\<mu\>,\<nu\>|)>> to
second order:
<\eqnarray>
<tformat|<table|<row|<cell|m<around*|(|\<mu\>,\<nu\>|)>>|<cell|\<assign\>>|<cell|f<rsub|1>+<around*|(|-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>|)><rsup|T>g<rsub|1>+<frac|1|2><around*|(|-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>|)><rsup|T>G<around*|(|-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>|)>>>|<row|<cell|>|<cell|=>|<cell|f<rsub|1>-\<mu\><around*|\||g<rsub|1>|\|>+<wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1>\<nu\><around*|\||g<rsub|1>|\|>+<frac|1|2><around*|[|\<mu\>,\<nu\>|]><matrix|<tformat|<table|<row|<cell|<around*|\<langle\>|<wide|g|^><rsub|1>|\<rangle\>><rsup|2>>|<cell|-<around*|\<langle\>|<wide|d|^><rsub|0>,<wide|g|^><rsub|1>|\<rangle\>>>>|<row|<cell|-<around*|\<langle\>|<wide|d|^><rsub|0>,<wide|g|^><rsub|1>|\<rangle\>>>|<cell|<around*|\<langle\>|<wide|d|^><rsub|0>|\<rangle\>><rsup|2>>>>>><around*|[|\<mu\>,\<nu\>|]><rsup|T>>>|<row|<cell|>|<cell|\<equiv\>>|<cell|f<rsub|1>-\<mu\><around*|\||g<rsub|1>|\|>+<wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1>\<nu\><around*|\||g<rsub|1>|\|>+<frac|1|2><around*|[|\<mu\>,\<nu\>|]><wide|G|^><around*|[|\<mu\>,\<nu\>|]><rsup|T>,>>>>
</eqnarray>
where
<\equation*>
<wide|G|^>\<assign\><matrix|<tformat|<table|<row|<cell|<around*|\<langle\>|-<wide|g|^><rsub|1>|\<rangle\>><rsup|2>>|<cell|<around*|\<langle\>|<wide|d|^><rsub|0>,-<wide|g|^><rsub|1>|\<rangle\>>>>|<row|<cell|<around*|\<langle\>|<wide|d|^><rsub|0>,-<wide|g|^><rsub|1>|\<rangle\>>>|<cell|<around*|\<langle\>|<wide|d|^><rsub|0>|\<rangle\>><rsup|2>>>>>>.
</equation*>
\;
Since <math|G> is assumed <em|positive-definite>, so is
<math|<wide|G|^>> (i.e., <math|<around*|\<langle\>|<wide|d|^><rsub|0>|\<rangle\>><rsup|2>\<gtr\>0>
and <math|det <wide|G|^>=<around*|\<langle\>|<wide|g|^><rsub|1>|\<rangle\>><rsup|2><around*|\<langle\>|<wide|d|^><rsub|0>|\<rangle\>><rsup|2>-<around*|\<langle\>|<wide|d|^><rsub|0>,<wide|g|^><rsub|1>|\<rangle\>><rsup|2>\<gtr\>0>),
so one can calculate the minimum of <math|m<around*|(|\<mu\>,\<nu\>|)>>,
which is attained at
<\equation*>
<around*|[|\<mu\>,\<nu\>|]><rsup|T>=<around*|\||g<rsub|1>|\|><wide|G|^><rsup|-1><around*|[|1,-<wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1>|]><rsup|T>.
</equation*>
Then <math|d<rsub|1>=-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>=<wide|s|^><around*|(|-<wide|g|^><rsub|1>+<wide|\<beta\>|^><wide|d|^><rsub|0>|)>>
gives
<\equation*>
<wide|s|^>\<equiv\>\<mu\>,<wide|\<beta\>|^>\<equiv\><frac|\<nu\>|\<mu\>>.
</equation*>
Note that <math|<wide|\<beta\>|^>> is the wanted result, but
<math|<wide|s|^>> differs a little from the lemma's, which is obtained
from the Taylor expansion of <math|f> in the direction
<math|-<wide|g|^><rsub|1>+<wide|\<beta\>|^><wide|d|^><rsub|0>> with
<math|<wide|\<beta\>|^>> given,
<\eqnarray>
<tformat|<table|<row|<cell|>|<cell|>|<cell|f<around*|(|x<rsub|1>+<wide|s|^><around*|(|-<wide|g|^><rsub|1>+<wide|\<beta\>|^><wide|d|^><rsub|0>|)>|)>>>|<row|<cell|>|<cell|=>|<cell|f<rsub|1>+<wide|s|^><around*|(|-<wide|g|^><rsub|1>+\<beta\><wide|d|^><rsub|0>|)>\<cdot\><wide|g|^><rsub|1><around*|\||g<rsub|1>|\|>+<frac|1|2><wide|s|^><rsup|2><around*|\<langle\>|-<wide|g|^><rsub|1>+<wide|\<beta\>|^><wide|d|^><rsub|0>|\<rangle\>><rsup|2>+o<around*|(|<wide|s|^><rsup|2><around*|\<langle\>|<around*|(|-<wide|g|^><rsub|1>+<wide|\<beta\>|^><wide|d|^><rsub|0>|)>|\<rangle\>><rsup|2>|)>.>>>>
</eqnarray>
Since <math|<around*|\<langle\>|-<wide|g|^><rsub|1>+<wide|\<beta\>|^><wide|d|^><rsub|0>|\<rangle\>><rsup|2>\<gtr\>0>,
one can safely obtain the minimum point, giving the lemma's <math|<wide|s|^>>.
(2) To get <math|<around*|\<langle\>|d<rsub|0>|\<rangle\>><rsup|2>,<around*|\<langle\>|d<rsub|0>,g<rsub|1>|\<rangle\>>,<around*|\<langle\>|g<rsub|1>|\<rangle\>><rsup|2>>
without Hessian matrix, one can use the Taylor expansion of
<math|g<around*|(|x|)>=\<nabla\>f<around*|(|x|)>> at <math|x<rsub|1>>,\
<\equation*>
g<around*|(|x|)>=g<rsub|1>+G<around*|(|x-x<rsub|1>|)>+o<around*|(|<around*|\<\|\|\>|x-x<rsub|1>|\<\|\|\>>|)>.
</equation*>
By taking <math|x=x<rsub|0>=x<rsub|1>-s<rsub|0>d<rsub|0>>, one gets
<\equation*>
s<rsub|0>G d<rsub|0>=g<rsub|1>-g<rsub|0>+o<around*|(|s<rsub|0><around*|\<\|\|\>|d<rsub|0>|\<\|\|\>>|)>,
</equation*>
which gives the expression of <math|<around*|\<langle\>|d<rsub|0>|\<rangle\>><rsup|2>,<around*|\<langle\>|d<rsub|0>,g<rsub|1>|\<rangle\>>>,
thus <math|<around*|\<langle\>|<wide|d|^><rsub|0>|\<rangle\>><rsup|2>,<around*|\<langle\>|<wide|d|^><rsub|0>,<wide|g|^><rsub|1>|\<rangle\>>>.
To get <math|<around*|\<langle\>|g<rsub|1>,g<rsub|1>|\<rangle\>>>, one
needs another point near <math|x<rsub|1>>, such as
<math|<wide|x|~>=x<rsub|1>-<wide|s|^>g<rsub|1>>; then
<\equation*>
<wide|s|^>G g<rsub|1>=g<rsub|1>-g<rsub|<wide|s|^>>+o<around*|(|<wide|s|^><around*|\<\|\|\>|g<rsub|1>|\<\|\|\>>|)>,
</equation*>
where <math|g<rsub|<wide|s|^>>=g<around*|(|<wide|x|~>|)>>, which gives
the expressions for <math|<around*|\<langle\>|g<rsub|1>,g<rsub|1>|\<rangle\>>>
and <math|<around*|\<langle\>|<wide|g|^><rsub|1>|\<rangle\>><rsup|2>>.
</proof>
</example>
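A numerical sketch of the lemma's formulas (assuming the inner product
<math|<around*|\<langle\>|u,v|\<rangle\>>=u<rsup|T>G v> is available as
a procedure <verbatim|ip>; in practice it would be approximated as in
part (2) of the lemma):
<\scm-code>
;; Subspace step in span{-g1, d0}: beta and s from the lemma.
;; ip computes u^T G v (an assumption here; part (2) shows how
;; to approximate it without the Hessian).
(define (dot u v) (apply + (map * u v)))
(define (subspace-beta-s ip d0h g1h g1-norm)
\ \ (let* ([dd (ip d0h d0h)] ;\<langle\>d0\<rangle\>^2
\ \ \ \ \ \ \ \ \ [dg (ip d0h g1h)] ;\<langle\>d0,g1\<rangle\>
\ \ \ \ \ \ \ \ \ [gg (ip g1h g1h)] ;\<langle\>g1\<rangle\>^2
\ \ \ \ \ \ \ \ \ [c (dot d0h g1h)] ;Euclidean d0 . g1
\ \ \ \ \ \ \ \ \ [beta (/ (- dg (* c gg)) (- dd (* c dg)))]
\ \ \ \ \ \ \ \ \ [s (* g1-norm
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (/ (- 1 (* beta c))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (+ gg (* -2 beta dg) (* beta beta dd))))])
\ \ \ \ (values beta s)))
</scm-code>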
<\example>
[This method does not need a new point, unlike Ex. 6]
Also given info of <math|x<rsub|k-1>,g<rsub|k-1>> and
<math|x<rsub|k>=x<rsub|k-1>+d<rsub|k-1>,g<rsub|k>>, we want to find next
point <math|x<rsub|k+1>=x<rsub|k>+d<rsub|k>>, where
<math|d<rsub|k>=\<mu\><around*|(|-g<rsub|k>|)>+\<nu\>d<rsub|k-1>>.
By Taylor expansion of <math|g<around*|(|x|)>> at <math|x<rsub|k>>, we
get
<\equation*>
g<around*|(|x<rsub|k>+\<mu\><around*|(|-g<rsub|k>|)>+\<nu\>d<rsub|k-1>|)>\<simeq\>g<rsub|k>+G\<cdot\><around*|(|-\<mu\>g<rsub|k>+\<nu\>d<rsub|k-1>|)>=g<rsub|k>-\<mu\>G\<cdot\>g<rsub|k>+\<nu\><around*|(|-g<rsub|k-1>+g<rsub|k>|)>,
</equation*>
where we have used <math|g<rsub|k-1>=g<around*|(|x<rsub|k>+0g<rsub|k>-d<rsub|k-1>|)>\<simeq\>g<rsub|k>-G\<cdot\>d<rsub|k-1>\<Rightarrow\>G\<cdot\>d<rsub|k-1>=-g<rsub|k-1>+g<rsub|k>>.
We hope <math|g<around*|(|x<rsub|k+1>|)>\<simeq\>0>, which means
<math|g<rsub|k>-\<mu\>G\<cdot\>g<rsub|k>+\<nu\><around*|(|-g<rsub|k-1>+g<rsub|k>|)>=0>.
Note that we cannot solve for <math|\<mu\>,\<nu\>> from this eqn, since
we do not know <math|G\<cdot\>g<rsub|k>>, but we do have some
information: taking <math|<around*|\<langle\>|d<rsub|k-1>,\<cdot\>|\<rangle\>>>
of this eqn, we get
<\equation*>
<around*|\<langle\>|-g<rsub|k-1>+g<rsub|k>,g<rsub|k>|\<rangle\>>\<mu\>-<around*|\<langle\>|-g<rsub|k-1>+g<rsub|k>,d<rsub|k-1>|\<rangle\>>\<nu\>=<around*|\<langle\>|g<rsub|k>,d<rsub|k-1>|\<rangle\>>.
</equation*>
This gives one requirement on <math|<around*|(|\<mu\>,\<nu\>|)>>, and we
need one more.
An additional requirement may be as follows: we hope the direction
<math|d<rsub|k>> is closer to <math|-g<rsub|k>> than to
<math|d<rsub|k-1>>; more precisely, we require
<\equation*>
d<rsub|k>=\<mu\><around*|\<\|\|\>|g<rsub|k>|\<\|\|\>><frac|-g<rsub|k>|<around*|\<\|\|\>|g<rsub|k>|\<\|\|\>>>+\<nu\><around*|\<\|\|\>|d<rsub|k-1>|\<\|\|\>><frac|d<rsub|k-1>|<around*|\<\|\|\>|d<rsub|k-1>|\<\|\|\>>>//K<frac|-g<rsub|k>|<around*|\<\|\|\>|g<rsub|k>|\<\|\|\>>>+<around*|(|\<pm\>1|)><frac|d<rsub|k-1>|<around*|\<\|\|\>|d<rsub|k-1>|\<\|\|\>>>,
</equation*>
i.e.,
<\equation*>
<frac|\<mu\><around*|\<\|\|\>|g<rsub|k>|\<\|\|\>>|K>=<frac|\<nu\><around*|\<\|\|\>|d<rsub|k-1>|\<\|\|\>>|\<pm\>1>,K=const\<gtr\>1
</equation*>
<small|[note: this is a relative requirement, which means that even if
<math|-g<rsub|k>//d<rsub|k-1>>, it can be fulfilled.]>
Now we can solve <math|<around*|(|\<mu\>,\<nu\>|)>> as follows,
<\align>
<tformat|<table|<row|<cell|\<mu\>=>|<cell|<frac|<around*|\<langle\>|g<rsub|k>,d<rsub|k-1>|\<rangle\>>|<around*|\<langle\>|-g<rsub|k-1>+g<rsub|k>,g<rsub|k>\<mp\><frac|<around*|\<\|\|\>|g<rsub|k>|\<\|\|\>>|K<around*|\<\|\|\>|d<rsub|k-1>|\<\|\|\>>>d<rsub|k-1>|\<rangle\>>>,>>|<row|<cell|\<nu\>=>|<cell|\<pm\><frac|<around*|\<\|\|\>|g<rsub|k>|\<\|\|\>>|K<around*|\<\|\|\>|d<rsub|k-1>|\<\|\|\>>>\<mu\>.>>>>
</align>
Note that <math|K\<gtr\>1> may be chosen as 2, 3, <text-dots>, so that
<math|\<mu\>\<gtr\>0> always holds; we choose the sign such that
<math|\<mu\>> is the smaller one.
</example>
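A sketch of these formulas in code (Euclidean dot products; the
<verbatim|sign> argument is <math|\<pm\>1>, chosen as described above):
<\scm-code>
;; mu, nu for d_k = mu (-g_k) + nu d_{k-1} (a sketch);
;; sign = +1 or -1; pick the one giving the smaller mu.
(define (dot u v) (apply + (map * u v)))
(define (norm v) (sqrt (dot v v)))
(define (mu-nu gk gk-1 dk-1 K sign)
\ \ (let* ([dg (map - gk gk-1)] ;g_k - g_{k-1}
\ \ \ \ \ \ \ \ \ [r (* sign (/ (norm gk) (* K (norm dk-1))))]
\ \ \ \ \ \ \ \ \ [mu (/ (dot gk dk-1)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (- (dot dg gk) (* r (dot dg dk-1))))]
\ \ \ \ \ \ \ \ \ [nu (* r mu)])
\ \ \ \ (values mu nu)))
</scm-code>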
<\example>
[newton-like method; this method also does not need a new point]
This method only uses <math|<around*|(|x<rsub|0>,g<rsub|0>|)>> and
<math|<around*|(|x<rsub|1>,g<rsub|1>|)>>. To get the next point,
<math|x<rsub|2>=x<rsub|1>+d<rsub|1>>, we need the displacement
<math|d<rsub|1>>.
At <math|x<rsub|1>>, we use Taylor expansion of gradient of <math|f>:
<\equation*>
g<around*|(|x<rsub|1>+d|)>\<simeq\>g<rsub|1>+H\<cdot\>d.
</equation*>
Let <math|\<Delta\>x\<assign\>x<rsub|1>-x<rsub|0>,\<Delta\>g\<assign\>g<rsub|1>-g<rsub|0>>,
then by <math|g<around*|(|x<rsub|0>|)>=g<around*|(|x<rsub|1>-\<Delta\>x|)>\<simeq\>g<rsub|1>-H\<cdot\>\<Delta\>x>,
we get
<\equation*>
H\<cdot\>\<Delta\>x=\<Delta\>g.
</equation*>
This is a constraint on the Hessian matrix.
On the <math|x<rsub|0>x<rsub|1>>-line, there exists a point
<math|x<rsub|t>=x<rsub|1>-t\<Delta\>x> whose function value is minimal,
i.e., whose gradient <math|g<rsub|t>=g<rsub|1>-t\<Delta\>g> is
orthogonal to <math|\<Delta\>x>; this gives
<\equation*>
t=<frac|<around*|\<langle\>|g<rsub|1>,\<Delta\>x|\<rangle\>>|<around*|\<langle\>|\<Delta\>g,\<Delta\>x|\<rangle\>>>.
</equation*>
<small|[note: <math|<around*|\<langle\>|\<Delta\>g,\<Delta\>x|\<rangle\>>=0>
means <math|<around*|\<langle\>|\<Delta\>x<around*|\||H|\|>\<Delta\>x|\<rangle\>>=0>,
i.e., <math|H> is degenerate, in which case we should not use the Newton
method; so we assume <math|<around*|\<langle\>|\<Delta\>g,\<Delta\>x|\<rangle\>>\<neq\>0>
here.]>
The Newton method gives the next point
<math|x<rsub|\<ast\>>\<equiv\>x<rsub|t>+d<rsub|\<ast\>>> (starting from
the above point <math|x<rsub|t>>), with
<\equation*>
H\<cdot\>d<rsub|\<ast\>>=-g<rsub|t>.
</equation*>
Taking <math|<around*|\<langle\>|\<Delta\>x,\<cdot\>|\<rangle\>>> of the
above eqn, we find
<\equation*>
<around*|\<langle\>|d<rsub|\<ast\>>,\<Delta\>g|\<rangle\>>=0.
</equation*>
So the best next point from the Newton method sits on the set:
<\equation*>
\<Sigma\>\<assign\><around*|{|x<rsub|t>+\<sigma\>:\<sigma\>\<perp\>\<Delta\>g|}>.
</equation*>
Our task is to find a point <math|p> whose normal line (the line through
the point <math|p> whose tangent vector is <math|p>'s gradient)
intersects <math|\<Sigma\>>, and to take the intersection point as the
\Pbest point\Q. So which point <math|p> is the best?
Obviously (see the figure below), the best choice of <math|p> is a point
that sits on an axis of the ellipses, since its normal line contains the
center of the ellipses! So how to find such a point?
Obviously, the two points that sit on the axes are separated by
<math|x<rsub|t>>.
<\big-figure|<with|gr-mode|<tuple|edit|math-at>|gr-frame|<tuple|scale|1cm|<tuple|0.5gw|0.5gh>>|gr-geometry|<tuple|geometry|1par|0.6par>|gr-grid|<tuple|empty>|gr-grid-old|<tuple|cartesian|<point|0|0>|1>|gr-edit-grid-aspect|<tuple|<tuple|axes|none>|<tuple|1|none>|<tuple|10|none>>|gr-edit-grid|<tuple|empty>|gr-edit-grid-old|<tuple|cartesian|<point|0|0>|1>|gr-auto-crop|true|gr-arrow-end|\|\<gtr\>|gr-line-width|2ln|<graphics||<cspline|<point|-3|0>|<point|0.0|1.4>|<point|3.0|0.0>|<point|0.0|-1.5>>|<with|arrow-end|\<gtr\>|<line|<point|-4|0>|<point|4.0|0.0>>>|<with|arrow-end|\<gtr\>|<line|<point|0|-2>|<point|0.0|3.0>>>|<line|<point|2|3>|<point|-4.0|-1.0>>|<point|-2.5|0>|<point|0|1.66667>|<with|arrow-end|\|\<gtr\>|line-width|2ln|<line|<point|0|1.66667>|<point|0.0|0.914567065919462>>>|<with|arrow-end|\|\<gtr\>|line-width|2ln|<line|<point|-2.5|0>|<point|-1.46386124180286|0.0>>>|<with|point-style|star|<point|0.0154631|0>>|<with|arrow-end|\|\<gtr\>|line-width|2ln|<line|<point|-1.0656|0.956267>|<point|-0.697496361952639|0.450224897473211>>>|<point|-1.0656|0.956267>|<with|arrow-end|\<gtr\>|dash-style|10|<spline|<point|-1.70251|2.32377>|<point|-0.580665431935441|2.36610332054505>|<point|0.233502532121533|1.82233502141436>>>|<math-at|t\<pm\>\<delta\>|<point|-2.46451250165366|2.32376967852891>>|<with|arrow-end|\<gtr\>|dash-style|10|<spline|<point|-2.58208757772192|2.17682563831195>|<point|-2.63384706971822|1.32892909114962>|<point|-2.32315968118072|0.117893545879523>>>|<math-at|x<rsub|0>x<rsub|1><text|-line>|<point|-4.6556918904617|-1.42275763989946>>|<line|<point|-0.882363160183357|1.07842455987776>|<point|-0.760996824976849|0.894728138642678>|<point|-0.932915651267251|0.773862209975949>>|<math-at|-g<rsub|<wide|t|^>>|<point|-0.79233364201614|0.249421219738061>>|<math-at|x<rsub|\<ast\>>|<point|0.0|-0.268826993993889>>|<with|dash-style|wave|<line|<point|-2.2|0.2>|<point|-2.65|-0.1>>>|<with|dash-style|wave|<line|<point|0.4|1.93333>|<point|-0.2|1.53333333333333>>>|<line|<point|-1.75583741235613|1.62220531816378>|<point|2.13885765312872|-1.91265379018389>>|<math-at|\<Sigma\>|<point|2.26585857917714|-1.72215240111126>>|<math-at|x<rsub|t>|<point|-1.13916419243302|1.25130577468191>>|<point|-1.55796|0.628024>|<with|dash-style|10|<line|<point|-2.2003406535256|1.18441592803281>|<point|1.69435441195925|-2.22344225426644>>>|<with|arrow-end|\|\<gtr\>|line-width|2ln|<line|<point|-1.55796|0.628024>|<point|-1.1512016811253|0.266419327182547>>>|<math-at|-g<rsub|\<tau\>>|<point|-1.10263899912756|0.0>>|<math-at|x<rsub|\<tau\>>|<point|-2.09450654848525|0.591744939806853>>>>>
note: the point on the minor axis is better!
</big-figure>
We assume that <math|t\<pm\>\<delta\>> are just these two points, which
means we require:
<\equation*>
<around*|\<langle\>|g<rsub|t+\<delta\>>,g<rsub|t-\<delta\>>|\<rangle\>>=0\<Rightarrow\>\<delta\>=<frac|<around*|\<\|\|\>|g<rsub|t>|\<\|\|\>>|<around*|\<\|\|\>|\<Delta\>g|\<\|\|\>>>.
</equation*>
Since we do not get the axis points exactly, the normal lines of the
above two points (which do not sit exactly on the axes) intersect
<math|\<Sigma\>> at two different points; which one is better?
To answer this question, note that there exists a point
<math|x<rsub|\<tau\>>=x<rsub|1>-\<tau\>\<Delta\>x> on the
<math|x<rsub|0>x<rsub|1>>-line whose normal line is parallel to
<math|\<Sigma\>>, i.e., <math|g<rsub|\<tau\>>\<perp\>\<Delta\>g>; this
gives
<\equation*>
\<tau\>=<frac|<around*|\<langle\>|g<rsub|1>,\<Delta\>g|\<rangle\>>|<around*|\<\|\|\>|\<Delta\>g|\<\|\|\>><rsup|2>>.
</equation*>
For a point on the <math|x<rsub|0>x<rsub|1>>-line, its normal line
intersects <math|\<Sigma\>> far away from <math|x<rsub|\<ast\>>> if the
point is near <math|x<rsub|\<tau\>>>, so we should choose the element of
<math|<around*|{|x<rsub|t+\<delta\>>,x<rsub|t-\<delta\>>|}>> that stays
away from <math|x<rsub|\<tau\>>>; that is, pick <math|x<rsub|\<xi\>>>,
where
<\equation*>
\<xi\>=<choice|<tformat|<table|<row|<cell|t+\<delta\>,>|<cell|t\<gtr\>\<tau\>,>>|<row|<cell|t-\<delta\>,>|<cell|t\<less\>\<tau\>.>>>>>
</equation*>
Now we have a point <math|x<rsub|\<xi\>>>, where <math|\<xi\>> is chosen
as above, and consider the intersection point of the normal line and
<math|\<Sigma\>>:
<\equation*>
<around*|{|x<rsub|\<xi\>>+\<lambda\>g<rsub|\<xi\>>:\<lambda\>\<in\>\<bbb-R\>|}>\<cap\><around*|{|x<rsub|t>+\<sigma\>:\<sigma\>\<perp\>\<Delta\>g|}>.
</equation*>
The normal line intersects <math|\<Sigma\>> at the point
<math|x<rsub|\<xi\>>+\<lambda\>g<rsub|\<xi\>>>, with [taking
<math|<around*|\<langle\>|\<Delta\>g,\<cdot\>|\<rangle\>>>]
<\equation*>
\<lambda\>=<frac|<around*|\<langle\>|x<rsub|t>-x<rsub|\<xi\>>,\<Delta\>g|\<rangle\>>|<around*|\<langle\>|g<rsub|\<xi\>>,\<Delta\>g|\<rangle\>>>=<frac|<around*|(|\<xi\>-t|)><around*|\<langle\>|\<Delta\>x,\<Delta\>g|\<rangle\>>|<around*|\<langle\>|g<rsub|1>-\<xi\>\<Delta\>g,\<Delta\>g|\<rangle\>>>.
</equation*>
Now we get the \Pbest point\Q:
<\align>
<tformat|<table|<row|<cell|x<rsub|2>\<assign\>>|<cell|x<rsub|\<xi\>>+\<lambda\>g<rsub|\<xi\>>>>|<row|<cell|=>|<cell|x<rsub|1>-\<xi\>\<Delta\>x+\<lambda\><around*|(|g<rsub|1>-\<xi\>\<Delta\>g|)>,>>>>
</align>
or the displacement
<\equation*>
d<rsub|1>\<assign\>-\<xi\>\<Delta\>x+\<lambda\><around*|(|g<rsub|1>-\<xi\>\<Delta\>g|)>.
</equation*>
</example>
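Putting the pieces together, a sketch of one step of this method
(list-based vectors; <verbatim|lambda*> stands for the <math|\<lambda\>>
above):
<\scm-code>
;; One step of the newton-like method: given (x0,g0), (x1,g1),
;; return d1 = -xi Dx + lambda (g1 - xi Dg) (a sketch).
(define (dot u v) (apply + (map * u v)))
(define (norm v) (sqrt (dot v v)))
(define (axpy a x y) (map (lambda (xi yi) (+ (* a xi) yi)) x y))
(define (newton-like-step x0 g0 x1 g1)
\ \ (let* ([Dx (map - x1 x0)] [Dg (map - g1 g0)]
\ \ \ \ \ \ \ \ \ [t (/ (dot g1 Dx) (dot Dg Dx))]
\ \ \ \ \ \ \ \ \ [gt (axpy (- t) Dg g1)] ;g_t = g1 - t Dg
\ \ \ \ \ \ \ \ \ [delta (/ (norm gt) (norm Dg))]
\ \ \ \ \ \ \ \ \ [tau (/ (dot g1 Dg) (dot Dg Dg))]
\ \ \ \ \ \ \ \ \ [xi (if (\<gtr\> t tau) (+ t delta) (- t delta))]
\ \ \ \ \ \ \ \ \ [gxi (axpy (- xi) Dg g1)] ;g_xi = g1 - xi Dg
\ \ \ \ \ \ \ \ \ [lambda* (/ (* (- xi t) (dot Dx Dg)) (dot gxi Dg))])
\ \ \ \ (axpy (- xi) Dx (map (lambda (g) (* lambda* g)) gxi))))
</scm-code>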
<\example>
[newton-like method: version-2]
[note: This method is an alternative similar to the above method, but it
turns out not to be good enough. The reason may be that one of
<math|<wide|t|^>\<pm\>\<delta\>> is not a good choice.]
In the above method we obtained <math|<wide|t|^>=t> and
<math|\<delta\>>. Since we assume <math|<wide|t|^>\<pm\>\<delta\>>
approximate the points on the axes, their intersections with
<math|\<Sigma\>> should be (almost) the same point, namely
<math|x<rsub|\<ast\>>>; so why not just take the intersection point of
their normal lines? From this consideration, we have the following
graph.
<\big-figure|<with|gr-mode|<tuple|edit|math-at>|gr-frame|<tuple|scale|1cm|<tuple|0.5gw|0.799999gh>>|gr-geometry|<tuple|geometry|1par|0.6par>|gr-grid|<tuple|empty>|gr-grid-old|<tuple|cartesian|<point|0|0>|1>|gr-edit-grid-aspect|<tuple|<tuple|axes|none>|<tuple|1|none>|<tuple|10|none>>|gr-edit-grid|<tuple|empty>|gr-edit-grid-old|<tuple|cartesian|<point|0|0>|1>|gr-dash-style|10|gr-auto-crop|true|<graphics||<line|<point|-4.0|-4.0>|<point|1.0|1.0>>|<point|-3|-3>|<point|-1.5|-1.5>|<point|0|0>|<with|arrow-end|\|\<gtr\>|line-width|2ln|<line|<point|-1.5|-1.5>|<point|-1.0|-2.0>>>|<with|arrow-end|\|\<gtr\>|line-width|2ln|<line|<point|0|0>|<point|0.0|-1.0>>>|<with|arrow-end|\|\<gtr\>|line-width|2ln|<line|<point|-3|-3>|<point|-2.0|-3.0>>>|<point|0|-3>|<with|dash-style|10|<line|<point|0|0>|<point|0.0|-3.0>|<point|-3.0|-3.0>>>|<with|dash-style|10|<line|<point|-1.5|-1.5>|<point|0.0|-3.0>>>|<math-at|x<rsub|0>x<rsub|1><text|-line>|<point|-3.7|-4>>|<math-at|<wide|t|^>|<point|-2|-1.3>>|<math-at|<wide|t|^>-\<delta\>|<point|-4.0|-3.0>>|<math-at|<wide|t|^>+\<delta\>|<point|-1.0|0.4>>|<math-at|\<delta\><around*|\<\|\|\>|\<Delta\>x|\<\|\|\>>|<point|-1.7|-0.5>>|<math-at|x<rsub|2>|<point|0.0|-3.5>>>>>
<math|<around*|\<\|\|\>|x<rsub|<wide|t|^>>-x<rsub|2>|\<\|\|\>>=\<delta\><around*|\<\|\|\>|\<Delta\>x|\<\|\|\>>>.
</big-figure>
By the graph, the next point is chosen as
<\equation*>
x<rsub|2>=x<rsub|<wide|t|^>>-\<delta\><around*|\<\|\|\>|\<Delta\>x|\<\|\|\>>g<rsub|<wide|t|^>>/<around*|\<\|\|\>|g<rsub|<wide|t|^>>|\<\|\|\>>=x<rsub|<wide|t|^>>-<frac|<around*|\<\|\|\>|\<Delta\>x|\<\|\|\>>|<around*|\<\|\|\>|\<Delta\>g|\<\|\|\>>>g<rsub|<wide|t|^>>,
</equation*>
i.e. the distance from <math|x<rsub|1>> to <math|x<rsub|2>> is
<\equation*>
d<rsub|1>=-<wide|t|^>\<Delta\>x-<frac|<around*|\<\|\|\>|\<Delta\>x|\<\|\|\>>|<around*|\<\|\|\>|\<Delta\>g|\<\|\|\>>><around*|(|g<rsub|1>-<wide|t|^>\<Delta\>g|)>,with
<wide|t|^>=<frac|<around*|\<langle\>|g<rsub|1>,\<Delta\>x|\<rangle\>>|<around*|\<langle\>|\<Delta\>g,\<Delta\>x|\<rangle\>>>.
</equation*>
By some numerical experiments, it is found that this method performs
noticeably worse than the previous method.
</example>
<\example>
BB-2 method, and its improvement
(1) BB-2 method.
In example-8, we have got
<\equation*>
H\<cdot\>\<Delta\>x=\<Delta\>g,<text| or >\<Delta\>x=H<rsup|-1>\<cdot\>\<Delta\>g.
</equation*>
The BB-2 method approximates the inverse-Hessian matrix as
<math|H<rsup|-1>\<sim\>\<alpha\>I>, in the sense that
<math|<around*|\<\|\|\>|\<Delta\>x-\<alpha\>\<Delta\>g|\<\|\|\>><rsup|2>>
is minimal. This gives
<\equation*>
\<alpha\>=<frac|<around*|\<langle\>|\<Delta\>x,\<Delta\>g|\<rangle\>>|<around*|\<langle\>|\<Delta\>g,\<Delta\>g|\<rangle\>>>.
</equation*>
Then, by the Newton method, we get:
<\equation*>
x<rsub|2>\<assign\>x<rsub|1>-H<rsup|-1>g<rsub|1>=x<rsub|1>-\<alpha\>g<rsub|1>=x<rsub|1>-<frac|<around*|\<langle\>|\<Delta\>x,\<Delta\>g|\<rangle\>>|<around*|\<langle\>|\<Delta\>g,\<Delta\>g|\<rangle\>>>g<rsub|1>,
</equation*>
or the distance from <math|x<rsub|1>> to <math|x<rsub|2>>:
<\equation*>
d<rsub|1>\<assign\>-<frac|<around*|\<langle\>|\<Delta\>x,\<Delta\>g|\<rangle\>>|<around*|\<langle\>|\<Delta\>g,\<Delta\>g|\<rangle\>>>g<rsub|1>.
</equation*>
(2) An improvement to BB-2 method.
We have <math|H<rsup|-1>\<sim\>\<alpha\>I> in the BB-2 method, but now we
start from another point on the line <math|x<rsub|0>x<rsub|1>>,
<math|x<rsub|t>=x<rsub|1>-t\<Delta\>x>. This point is chosen s.t.
<math|g<rsub|t>=g<rsub|1>-t\<Delta\>g> is orthogonal to the line
<math|x<rsub|0>x<rsub|1>>, which means
<\equation*>
t=<frac|<around*|\<langle\>|g<rsub|1>,\<Delta\>x|\<rangle\>>|<around*|\<langle\>|\<Delta\>g,\<Delta\>x|\<rangle\>>>.
</equation*>
Thus we have
<\equation*>
x<rsub|2>\<assign\>x<rsub|t>-\<alpha\>g<rsub|t>=x<rsub|1>-t\<Delta\>x-\<alpha\><around*|(|g<rsub|1>-t\<Delta\>g|)>,
</equation*>
or
<\equation*>
d<rsub|1>\<assign\>-t\<Delta\>x-\<alpha\><around*|(|g<rsub|1>-t\<Delta\>g|)>.
</equation*>
Remarkably, this improvement works quite well for high-dimensional
problems.
</example>
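To make these formulas concrete, here is a minimal Scheme sketch of the
BB-2 step and its improved version, assuming hypothetical vector helpers
<verbatim|vector-sub>, <verbatim|vector-scale> and <verbatim|vector-dot>
(they are not defined in this document):
<\scm-code>
;;BB-2 step and its improvement (a sketch; vector helpers assumed)
(define (BB2-step x0 g0 x1 g1)
\ \ (let* ([dx (vector-sub x1 x0)] ;;Delta-x
\ \ \ \ \ \ \ \ \ [dg (vector-sub g1 g0)] ;;Delta-g
\ \ \ \ \ \ \ \ \ [alpha (/ (vector-dot dx dg) (vector-dot dg dg))])
\ \ \ \ (vector-sub x1 (vector-scale alpha g1)))) ;;x2 = x1 - alpha*g1
(define (BB2-improved-step x0 g0 x1 g1)
\ \ (let* ([dx (vector-sub x1 x0)]
\ \ \ \ \ \ \ \ \ [dg (vector-sub g1 g0)]
\ \ \ \ \ \ \ \ \ [alpha (/ (vector-dot dx dg) (vector-dot dg dg))]
\ \ \ \ \ \ \ \ \ [t (/ (vector-dot g1 dx) (vector-dot dg dx))]
\ \ \ \ \ \ \ \ \ [xt (vector-sub x1 (vector-scale t dx))] ;;x_t
\ \ \ \ \ \ \ \ \ [gt (vector-sub g1 (vector-scale t dg))]) ;;g_t
\ \ \ \ (vector-sub xt (vector-scale alpha gt)))) ;;x2 = x_t - alpha*g_t
</scm-code>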
<subsection|Trust region approach>
In this subsection, we consider another approach to optimization
problems, the trust region approach, which may save some computations
compared to the line search approach.
Assume we have got <math|x<rsub|k>>; to get the next approximation point
<math|x<rsub|k+1>>, we first restrict to a compact region whose center is
<math|x<rsub|k>> and whose radius is <math|R<rsub|k>> (depending on
<math|x<rsub|k>>), and find <math|x<rsub|k+1>> in this region
<\equation*>
<around*|\<\|\|\>|x<rsub|k+1>-x<rsub|k>|\<\|\|\>>\<leqslant\>R<rsub|k>,
</equation*>
by some methods, like Newton, CG, Subspace, etc.
This approach is reasonable because, in those methods, <math|x<rsub|k+1>>
is obtained from an approximate function coming from the Taylor expansion
at <math|x<rsub|k>>. This means that when the distance
<math|<around*|\<\|\|\>|x<rsub|k+1>-x<rsub|k>|\<\|\|\>>> is too large, the
approximate function is not a good approximation. So we should restrict
to a small region, defined by <math|R<rsub|k>>, where the
approximate function is good.
Let's denote the objective function by <math|f<around*|(|x|)>,x\<in\>\<bbb-R\><rsup|n>>;
near <math|x<rsub|k>>, we have an approximate function (model function),
<math|m<rsub|k><around*|(|d|)>>, with
<\equation*>
m<rsub|k><around*|(|0|)>=f<around*|(|x<rsub|k>|)>,m<rsub|k><around*|(|d<rsub|k>|)>\<approx\>f<around*|(|x<rsub|k>+d<rsub|k>|)>.
</equation*>
Assume we can solve the approximate problem,
<\equation>
d<rsub|k>\<assign\>arg min<rsub|<around*|\<\|\|\>|d|\<\|\|\>>\<leqslant\>R<rsub|k>>m<rsub|k><around*|(|d|)><label|solve-distance>,
</equation>
and get next point <math|x<rsub|k+1>\<assign\>x<rsub|k>+d<rsub|k>>.
To measure how well the model function <math|m<rsub|k>> approximates
<math|f>, consider the ratio:
<\equation*>
\<rho\><rsub|k>\<assign\><frac|f<around*|(|x<rsub|k>|)>-f<around*|(|x<rsub|k+1>|)>|m<rsub|k><around*|(|0|)>-m<rsub|k><around*|(|d<rsub|k>|)>>.
</equation*>
If <math|\<rho\><rsub|k>\<sim\>1>, then we know the model function is a
good approximation; if <math|\<rho\><rsub|k>\<gtr\>1>, then the objective
function decreases even more than the model predicts; in the other cases,
the model function is not a good approximation, which means one should
shrink the region in the next iteration.
<\scm-code>
;;;;Trust region approach
;; *** assume we have (a), (b), (c):
;;(a) objective function
(define (f x) ...)
;;(b) model-function
(define (model-func x) (lambda (d) ...))
;;(c) solve the distance, d, from model-function in region R
(define (solve-func x m R) ...) ;=\<gtr\> d, so x_new = x + d
;; *** trust-region ***
(define (trust-region-approach f model-func x_init R_init Rmin)
\ \ ;;x_init: start point, R_init: start radius; stop once R \<less\>= Rmin
\ \ (let loop ([x0 x_init] [R0 R_init])
\ \ \ \ (if (\<less\>= R0 Rmin)
\ \ \ \ \ \ \ \ x0 ;;region radius exhausted: accept current point
\ \ \ \ \ \ \ \ (let* ([m (model-func x0)]
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ [d0 (solve-func x0 m R0)] ;;solve-func
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ [x1 (vector-add x0 d0)]
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ [rho0 (/ (- (f x0) (f x1)) (- (f x0) (m d0)))])
\ \ \ \ \ \ \ \ \ \ (cond [(\<less\>= rho0 0) (loop x0 (/ R0 4))] ;;worse point: reject, shrink
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ [(\<less\> rho0 0.25) (loop x1 (/ R0 4))] ;;poor model: shrink
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ [(and (\<gtr\> rho0 0.75) (= (vector-norm d0) R0))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (loop x1 (* 2 R0))] ;;good model at boundary: expand
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ [else (loop x1 R0)])))))
;;note: the iteration stops once the region radius shrinks below Rmin
</scm-code>
Now the key problem is to solve <math|d<rsub|k>> in problem
(<reference|solve-distance>) (i.e., <verbatim|solve-func> in the above
code). In the following subsubsections, we address this problem.
<subsubsection|Newton, Quasi-Newton, LMF and CG methods>
We can use the Newton method to get a model function.
Consider the Taylor expansion,
<\equation*>
f<around*|(|x|)>=f<around*|(|x<rsub|0>|)>+<around*|(|x-x<rsub|0>|)><rsup|T>g+<frac|1|2><around*|(|x-x<rsub|0>|)><rsup|T>G<around*|(|x-x<rsub|0>|)>+o<around*|(|<around*|\<\|\|\>|x-x<rsub|0>|\<\|\|\>><rsup|2>|)>,
</equation*>
where <math|g=grad f<around*|(|x<rsub|0>|)>> and <math|G> is the Hessian
matrix of <math|f> at <math|x<rsub|0>>.
This gives a model function,
<\equation*>
m<around*|(|d|)>=f<around*|(|x<rsub|0>|)>+d<rsup|T>g+<frac|1|2>d<rsup|T>G
d.
</equation*>
Assuming <math|G> is positive definite, <math|m<around*|(|d|)>> attains
its minimum value at
<\equation*>
d=-G<rsup|-1>g.
</equation*>
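For concreteness, a minimal Scheme sketch of this quadratic model and the
corresponding Newton step, assuming hypothetical helpers
<verbatim|vector-dot>, <verbatim|vector-scale>, <verbatim|mat-vec>
(matrix-vector product) and <verbatim|solve-linear> (solving a linear
system); none of these are defined in this document:
<\scm-code>
;;quadratic model m(d) = f0 + d^T g + 1/2 d^T G d (a sketch)
(define (quadratic-model f0 g G)
\ \ (lambda (d)
\ \ \ \ (+ f0
\ \ \ \ \ \ \ (vector-dot d g)
\ \ \ \ \ \ \ (* 0.5 (vector-dot d (mat-vec G d))))))
;;Newton step: d = -G^{-1}g, i.e. the solution of G d = -g
(define (newton-model-step g G)
\ \ (solve-linear G (vector-scale -1.0 g)))
</scm-code>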
<\note>
(i) The Hessian <math|G> and its inverse are hard to get; one can use the
<with|font-series|bold|Quasi-Newton> method instead;
(ii) <math|G> may not even be invertible; one can use <math|G+\<nu\>I>
with some not-too-large <math|\<nu\>\<gtr\>0> instead; this method is
called the <with|font-series|bold|LMF> method
(Levenberg-Marquardt-Fletcher);
(iii) One can also use the <with|font-series|bold|Conjugate-Gradient>
method to get <math|d> from <math|m<around*|(|d|)>> instead of solving
<math|G<rsup|-1>g>.
(iv) Since <math|G<rsup|-1>> or <math|G> is hard to get, except for the
Quasi-Newton method, the subspace method shown below would be most
useful.
</note>
<subsubsection|Subspace method>
In the following, we consider the <strong|subspace method>.
Suppose that we have got <math|x<rsub|0>> and
<math|x<rsub|1>=x<rsub|0>+d<rsub|0>>, and try to get
<math|x<rsub|2>=x<rsub|1>+d<rsub|1>>, where
<math|d<rsub|1>\<in\>span<around*|{|-g<rsub|1>,d<rsub|0>|}>>, i.e.,
consider
<\equation*>
x=x<rsub|1>-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>,
</equation*>
and try to find best <math|\<mu\>,\<nu\>>, where
<math|<wide|g|^><rsub|1>=g<rsub|1>/<around*|\||g<rsub|1>|\|>> and
<math|<wide|d|^><rsub|0>=d<rsub|0>/<around*|\||d<rsub|0>|\|>>.
Since <math|f<around*|(|x|)>=f<around*|(|x<rsub|1>-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>|)>=f<rsub|1>+<around*|(|-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>|)><rsup|T>g<rsub|1>+<frac|1|2><around*|(|-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>|)><rsup|T>G<around*|(|-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>|)>+O<around*|(|\<mu\><rsup|3>,\<nu\><rsup|3>|)>>,
we get a model function <math|m<around*|(|\<mu\>,\<nu\>|)>> and its
2nd-order part <math|m<rsup|<around*|(|2|)>><around*|(|\<mu\>,\<nu\>|)>>:
<\eqnarray>
<tformat|<table|<row|<cell|m<around*|(|\<mu\>,\<nu\>|)>>|<cell|\<assign\>>|<cell|f<rsub|1>+<around*|(|-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>|)><rsup|T>g<rsub|1>+<frac|1|2><around*|(|-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>|)><rsup|T>G<around*|(|-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>|)>>>|<row|<cell|>|<cell|=>|<cell|f<rsub|1>-\<mu\><around*|\||g<rsub|1>|\|>+<wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1>\<nu\><around*|\||g<rsub|1>|\|>+<frac|1|2><around*|[|\<mu\>,\<nu\>|]><matrix|<tformat|<table|<row|<cell|<around*|\<langle\>|<wide|g|^><rsub|1>|\<rangle\>><rsup|2>>|<cell|-<around*|\<langle\>|<wide|d|^><rsub|0>,<wide|g|^><rsub|1>|\<rangle\>>>>|<row|<cell|-<around*|\<langle\>|<wide|d|^><rsub|0>,<wide|g|^><rsub|1>|\<rangle\>>>|<cell|<around*|\<langle\>|<wide|d|^><rsub|0>|\<rangle\>><rsup|2>>>>>><around*|[|\<mu\>,\<nu\>|]><rsup|T>>>|<row|<cell|>|<cell|\<equiv\>>|<cell|f<rsub|1>-\<mu\><around*|\||g<rsub|1>|\|>+<wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1>\<nu\><around*|\||g<rsub|1>|\|>+<frac|1|2><around*|[|\<mu\>,\<nu\>|]><wide|G|^><around*|[|\<mu\>,\<nu\>|]><rsup|T>.>>>>
</eqnarray>
where
<\equation*>
<around*|\<langle\>|\<xi\>,\<eta\>|\<rangle\>>\<assign\>\<xi\><rsup|T>G\<eta\>=\<xi\><rsup|T><wide|G|^>\<eta\>,<wide|G|^>\<assign\><matrix|<tformat|<table|<row|<cell|<around*|\<langle\>|-<wide|g|^><rsub|1>|\<rangle\>><rsup|2>>|<cell|<around*|\<langle\>|<wide|d|^><rsub|0>,-<wide|g|^><rsub|1>|\<rangle\>>>>|<row|<cell|<around*|\<langle\>|<wide|d|^><rsub|0>,-<wide|g|^><rsub|1>|\<rangle\>>>|<cell|<around*|\<langle\>|<wide|d|^><rsub|0>|\<rangle\>><rsup|2>>>>>>,
</equation*>
and the elements of <math|<wide|G|^>> were obtained in Lemma-5,
<\eqnarray>
<tformat|<table|<row|<cell|<around*|\<langle\>|<wide|d|^><rsub|0>|\<rangle\>><rsup|2>>|<cell|=>|<cell|<frac|<around*|\||g<rsub|1>|\|>|<around*|\||d<rsub|0>|\|>><around*|(|<wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1>-<frac|<around*|\||g<rsub|0>|\|>|<around*|\||g<rsub|1>|\|>><wide|d<rsub|0>|^>\<cdot\><wide|g|^><rsub|0>|)>,>>|<row|<cell|<around*|\<langle\>|<wide|d|^><rsub|0>,<wide|g|^><rsub|1>|\<rangle\>>>|<cell|=>|<cell|<frac|<around*|\||g<rsub|1>|\|>|<around*|\||d<rsub|0>|\|>><around*|(|1-<frac|<around*|\||g<rsub|0>|\|>|<around*|\||g<rsub|1>|\|>><wide|g|^><rsub|0>\<cdot\><wide|g|^><rsub|1>|)>,>>|<row|<cell|<around*|\<langle\>|<wide|g|^><rsub|1>|\<rangle\>><rsup|2>>|<cell|=>|<cell|<frac|<around*|\||g<rsub|1>|\|>|d<rsub|min>><around*|(|1-<frac|<around*|\||<wide|g|~>|\|>|<around*|\||g<rsub|1>|\|>><wide|<wide|g|~>|^>\<cdot\><wide|g|^><rsub|1>|)>,<wide|x|~>=x<rsub|1>+<wide|s|~><around*|(|-<wide|g|^><rsub|1>|)>,
<wide|s|~>\<assign\>d<rsub|min>.>>>>
</eqnarray>
Note that <math|<wide|G|^>> need not be positive-definite, and may even
fail to be invertible; see the figure below.
<\big-figure|<with|gr-mode|<tuple|edit|math-at>|gr-frame|<tuple|scale|1cm|<tuple|0.519998gw|0.629988gh>>|gr-geometry|<tuple|geometry|1par|0.6par>|gr-grid|<tuple|empty>|gr-grid-old|<tuple|cartesian|<point|0|0>|1>|gr-edit-grid-aspect|<tuple|<tuple|axes|none>|<tuple|1|none>|<tuple|10|none>>|gr-edit-grid|<tuple|empty>|gr-edit-grid-old|<tuple|cartesian|<point|0|0>|1>|gr-auto-crop|true|gr-arrow-end|\<gtr\>|gr-line-width|2ln|<graphics||<with|arrow-end|\<gtr\>|<line|<point|-7|0>|<point|-1.0|0.0>>>|<with|arrow-end|\<gtr\>|<spline|<point|0|0>|<point|6.0|0.0>>>|<point|-6.0|1.0>|<cspline|<point|-4.0|0.7>|<point|-2.0|0.0>|<point|-4.0|-0.7>|<point|-6.0|0.0>>|<cspline|<point|-4.0|0.5>|<point|-2.3|0.0>|<point|-4.0|-0.5>|<point|-5.6|0.0>>|<point|-5.1|0.541187>|<point|-5.411|0.2>|<with|arrow-end|\<gtr\>|dash-style|10|<line|<point|-6|-0.5>|<point|-5.411|0.2>>>|<math-at|x<rsub|2>|<point|-6.3|-0.7>>|<math-at|x<rsub|1>|<point|-6.3|1.2>>|<with|opacity|40%|fill-color|dark
red|line-width|2ln|<carc|<point|-6|2>|<point|-5.0|1.0>|<point|-6.0|0.0>>>|<math-at|\<Delta\><rsub|1>|<point|-5.8402235745469|1.4694404021696>>|<point|4|1>|<point|-4|0>|<point|3|0>|<point|3.28586|0.3>|<spline|<point|1.0|-1.5>|<point|3.0|-0.7>|<point|5.0|-1.6>>|<spline|<point|1.0|1.7>|<point|3.0|0.8>|<point|5.0|1.7>>|<with|opacity|40%|fill-color|dark
red|line-width|2ln|<carc|<point|4|2>|<point|5.0|1.3>|<point|4.0|0.0>>>|<line|<point|0.3|-1.4>|<point|6.0|1.6>>|<line|<point|0.3|1.5>|<point|6.0|-1.6>>|<spline|<point|6.4|1.5>|<point|4.7|0.0>|<point|6.4|-1.5>>|<point|4.84626|0.4>|<math-at|x<rsub|1>|<point|3.7|1.2>>|<math-at|x<rsub|2>|<point|5.0|0.3>>|<math-at|\<Delta\><rsub|1>|<point|4.1597764254531|1.4694404021696>>|<spline|<point|-0.0872966000793756|1.39629911363937>|<point|1.47747302779174|0.0>|<point|-0.0661297790713057|-1.22838669136129>>|<with|arrow-end|\|\<gtr\>|line-width|2ln|<line|<point|-6|1>|<point|-5.411|0.2>>>|<with|arrow-end|\|\<gtr\>|line-width|2ln|<line|<point|4|1>|<point|4.84626|0.4>>>|<with|dash-style|10|<line|<point|-6|1>|<point|-4.0|0.0>>>|<with|dash-style|10|<line|<point|4|1>|<point|3.0|0.0>>>|<with|arrow-end|\<gtr\>|dash-style|10|<line|<point|-4.69185738854346|1.18810358513031>|<point|-5.0|0.7>>>|<math-at|x<rsub|2><rprime|'>|<point|-4.50135599947083|1.3362713321868>>|<with|arrow-end|\<gtr\>|dash-style|10|<line|<point|2.61069585924064|0.531932133880143>|<point|3.28586|0.3>>>|<math-at|x<rsub|2><rprime|'>|<point|2.12161253983523|0.509298443247507>>|<with|arrow-end|\<gtr\>|<spline|<point|3.0|-1.5>|<point|3.0|3.0>>>|<with|arrow-end|\<gtr\>|<line|<point|-4.0|-1.5>|<point|-4.0|3.0>>>|<spline|<point|2|-3>|<point|0.5|-4.0>|<point|2.0|-5.0>>|<point|1.235|-3.3>|<with|opacity|40%|fill-color|dark
red|<carc|<point|2|-3>|<point|1.0|-2.4>|<point|1.6|-4.0>>>|<spline|<point|1.6|-2.0>|<point|-0.5|-4.0>|<point|1.4|-5.6>>|<point|0.7|-2.5375>|<with|arrow-end|\<gtr\>|line-width|2ln|<line|<point|1.235|-3.3>|<point|0.7|-2.5375>>>|<math-at|x<rsub|1>|<point|1.3|-3.6>>|<math-at|x<rsub|2>|<point|0.5|-2.3>>|<math-at|<around*|\<nobracket\>|a|)>|<point|-4.3|-2.0>>|<math-at|<around*|\<nobracket\>|c|)>|<point|-2|-3.3>>|<with|arrow-end|\<gtr\>|<line|<point|-2.0|-4.0>|<point|3.0|-4.0>>>|<with|arrow-end|\<gtr\>|<line|<point|-1.0|-5.6>|<point|-1.0|-2.0>>>|<math-at|\<nu\>|<point|-3.5|3>>|<math-at|<around*|\<nobracket\>|b|)>|<point|4|-1.8>>|<math-at|\<mu\>|<point|-1|-0.5>>|<math-at|\<nu\>|<point|3.3|3>>|<math-at|\<mu\>|<point|6|-0.5>>|<math-at|\<mu\>|<point|3|-4.4>>|<math-at|\<nu\>|<point|-1.3|-2>>|<math-at|\<Delta\><rsub|1>|<point|1.3265312872073|-2.82677933589099>>|<math-at|m\<gtr\>0|<point|5.15773|1.95692>>|<math-at|m=0|<point|5.48572|1.32933>>|<math-at|m\<less\>0|<point|5.60223|0.813914>>>>>
Three cases of <math|m<around*|(|\<mu\>,\<nu\>|)>>: a)
<math|m<around*|(|\<mu\>,\<nu\>|)>=<frac|1|2><around*|(|\<mu\><rsup|2>+<frac|\<nu\><rsup|2>|0.1<rsup|2>>|)>>;
b) <math|m<around*|(|\<mu\>,\<nu\>|)>=<frac|1|2><around*|(|-\<mu\><rsup|2>+<frac|\<nu\><rsup|2>|0.1<rsup|2>>|)>>;
c) <math|m<around*|(|\<mu\>,\<nu\>|)>=\<nu\><rsup|2>-\<mu\>>. Note: the
best solution in the region <math|\<Delta\><rsub|1>> centered at
<math|x<rsub|1>> is <math|x<rsub|2>>, not <math|x<rsub|2><rprime|'>>.
</big-figure>
From the above figure, we see that different <math|<wide|G|^>> lead to
different solutions, which need to be addressed separately.
<\enumerate-alpha>
<item><math|<wide|G|^>> is positive-definite
a1) If the region is large enough that it contains the solution given by
the line-search approach (in the above figure, <math|<around*|(|0,0|)>>
is in the region), then the best <math|d<rsub|1>> is
<\equation*>
d<rsub|1>=-<wide|g|^><rsub|1>\<mu\>+<wide|d|^><rsub|0>\<nu\>=<wide|s|^><around*|(|-<wide|g|^><rsub|1>+<wide|\<beta\>|^><wide|d|^><rsub|0>|)>,
</equation*>
where
<\eqnarray>
<tformat|<table|<row|<cell|<wide|\<beta\>|^>>|<cell|=>|<cell|<frac|<around*|\<langle\>|<wide|d|^><rsub|0>,<wide|g|^><rsub|1>|\<rangle\>>-<wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1><around*|\<langle\>|<wide|g|^><rsub|1>|\<rangle\>><rsup|2>|<around*|\<langle\>|<wide|d|^><rsub|0>|\<rangle\>><rsup|2>-<wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1><around*|\<langle\>|<wide|d|^><rsub|0>,<wide|g|^><rsub|1>|\<rangle\>>>,>>|<row|<cell|<wide|s|^>>|<cell|=>|<cell|<around*|\||g<rsub|1>|\|><frac|1-<wide|\<beta\>|^><wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1>|<around*|\<langle\>|<wide|g|^><rsub|1>|\<rangle\>><rsup|2>-2<wide|\<beta\>|^><around*|\<langle\>|<wide|d|^><rsub|0>,<wide|g|^><rsub|1>|\<rangle\>>+<wide|\<beta\>|^><rsup|2><around*|\<langle\>|<wide|d|^><rsub|0>|\<rangle\>><rsup|2>>.>>>>
</eqnarray>
(A Scheme sketch of the a1) computation is given after this list.)
a2) If the region is not large enough (the case in figure a) above), the
minimum point is at the boundary.
<item><math|<wide|G|^>> is negative-definite
The minimum point is at the boundary.
<item><math|<wide|G|^>> is degenerate
The minimum point is at the boundary.
<item>Note that for cases a2), b) and c), since the minimum points are
all at the boundary, the methods to solve for the minimum point are the
same. We now address them together.
At the boundary, <math|<around*|\||d<rsub|1>|\|><rsup|2>=R<rsub|1><rsup|2>>,
i.e., we have the constraint,
<\equation*>
<frac|1|2><around*|[|\<mu\>,\<nu\>|]>K<around*|[|\<mu\>,\<nu\>|]><rsup|T>-R<rsub|1><rsup|2>=0,where
K=<matrix|<tformat|<table|<row|<cell|2>|<cell|-2<wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1>>>|<row|<cell|-2<wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1>>|<cell|2>>>>>.
</equation*>
By Lagrange's multiplier method, one can consider the minimum of
<\eqnarray>
<tformat|<table|<row|<cell|<wide|m|~><around*|(|\<mu\>,\<nu\>,\<lambda\>|)>>|<cell|=>|<cell|m<around*|(|\<mu\>,\<nu\>|)>-\<lambda\><around*|(|<frac|1|2><around*|[|\<mu\>,\<nu\>|]>K<around*|[|\<mu\>,\<nu\>|]><rsup|T>-R<rsub|1><rsup|2>|)>>>|<row|<cell|>|<cell|=>|<cell|f<rsub|1>-\<mu\><around*|\||g<rsub|1>|\|>+<wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1>\<nu\><around*|\||g<rsub|1>|\|>+<frac|1|2><around*|[|\<mu\>,\<nu\>|]><wide|<with|font|cal|G>|^><around*|[|\<mu\>,\<nu\>|]><rsup|T>+\<lambda\>R<rsub|1><rsup|2>,>>|<row|<cell|<wide|<with|font|cal|G>|^>>|<cell|\<assign\>>|<cell|<around*|(|<wide|G|^>-\<lambda\>K|)>.>>>>
</eqnarray>
Note that <math|<wide|<with|font|cal|G>|^>> must be invertible (? needs
proof), so one can solve <math|<around*|[|\<mu\>,\<nu\>|]>> as a function
of <math|\<lambda\>>, then substitute the solution into the constraint to
get <math|\<lambda\>>, and then obtain the final solution:
xxxxxxxxx
</enumerate-alpha>
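As promised, a hedged Scheme sketch of the a1) computation, assuming the
three entries of <math|<wide|G|^>> and the Euclidean product
<math|<wide|d|^><rsub|0>\<cdot\><wide|g|^><rsub|1>> have already been
computed (e.g. via Lemma-5); all names are illustrative:
<\scm-code>
;;subspace method, case a1) (a sketch)
;;gg, dg, dd: the entries of G-hat; d.g = d0-hat . g1-hat; norm-g1 = |g1|
(define (subspace-a1-coeffs gg dg dd d.g norm-g1)
\ \ (let* ([beta (/ (- dg (* d.g gg))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (- dd (* d.g dg)))]
\ \ \ \ \ \ \ \ \ [s (* norm-g1
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (/ (- 1 (* beta d.g))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (+ gg (* -2 beta dg) (* beta beta dd))))])
\ \ \ \ (values s beta))) ;;d1 = s * (-g1-hat + beta*d0-hat)
</scm-code>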
<subsection|Strict-inequality constrained optimization problem>
In this subsection, we consider the strict-inequality constrained
optimization problem, whose form is
<\eqnarray>
<tformat|<table|<row|<cell|>|<cell|>|<cell|min
f<around*|(|x|)>,>>|<row|<cell|s.t.>|<cell|>|<cell|c<rsub|i><around*|(|x|)>\<gtr\>0,i\<in\><with|font|cal|I>\<equiv\><around*|{|1,\<cdots\>,m|}>.>>>>
</eqnarray>
<\note>
We assume the problem, <math|min f<around*|(|x|)>> with
<math|c<rsub|i><around*|(|x|)>\<geqslant\>0,i\<in\><with|font|cal|I>>,
does not take its solution point at the boundary
<math|c<rsub|i><around*|(|x|)>=0,some i>.
</note>
For strict-inequality constraints, the allowed region of points is an
<with|font-series|bold|open subset>, <math|\<Omega\>>, of
<math|\<bbb-R\><rsup|n>>, which means these constraints are not active at
the minimum point. But this does not mean the constraints are useless
during an iteration process, since an iterative step may be too large to
stay in the allowed region. To address this problem, one can use a
<with|font-series|bold|barrier method> or <strong|interior-point method>
to keep the points inside the allowed region. That is to say, we consider
the problem
<\equation*>
min P<rsub|\<sigma\>><around*|(|x|)>\<assign\>f<around*|(|x|)>-\<sigma\><big|sum><rsub|i>ln
c<rsub|i><around*|(|x|)>,
</equation*>
where <math|\<sigma\>> varies as follows:
Consider the iteration seq <math|<around*|{|x<rsub|0>,x<rsub|1>,x<rsub|2>,\<cdots\>|}>>
in <math|\<Omega\>>, where <math|x<rsub|k+1>=x<rsub|k>+s<rsub|k>d<rsub|k>>.
To get <math|x<rsub|k+1>>, one can get a point
<math|<wide|x|^><rsub|k+1>=x<rsub|k>+<wide|s|^><rsub|k>d<rsub|k>> by
iterative process of <math|min P<rsub|\<sigma\>><around*|(|x|)>>, then
<\enumerate-alpha>
<item>If <math|<wide|x|^><rsub|k+1>\<in\>\<Omega\>>, then
<math|x<rsub|k+1>=<wide|x|^><rsub|k+1>>, and choose <math|\<sigma\>=0>;
<item>If <math|<wide|x|^><rsub|k+1>\<nin\>\<Omega\>>, then take
<math|s<rsub|k>> smaller than <math|<wide|s|^><rsub|k>> such that
<math|x<rsub|k+1>=x<rsub|k>+s<rsub|k>d<rsub|k>\<in\>\<Omega\>>, and
change <math|\<sigma\>=max<around*|{|2\<sigma\>,\<sigma\><rsub|0>|}>>,
where <math|\<sigma\><rsub|0>> is an input number.
</enumerate-alpha>
<\note>
One needs to make sure that all points used in the algorithm to get
<math|<around*|{|x<rsub|0>,x<rsub|1>,\<cdots\>|}>> are in the allowed
region <math|\<Omega\>>; otherwise <math|c<rsub|i>\<leqslant\>0> for some
<math|i>, which makes <math|ln c<rsub|i><around*|(|x|)>> ill-defined.
</note>
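As a small illustration of the idea, a Scheme sketch of the barrier
function <math|P<rsub|\<sigma\>>>, where <verbatim|f> is the objective
and <verbatim|cs> a list of the constraint functions <math|c<rsub|i>>
(names are assumptions of this sketch):
<\scm-code>
;;barrier function P_sigma (a sketch)
;;f: objective; cs: list of constraints c_i, required c_i(x) \<gtr\> 0
(define (barrier-function f cs sigma)
\ \ (lambda (x)
\ \ \ \ (- (f x)
\ \ \ \ \ \ \ (* sigma
\ \ \ \ \ \ \ \ \ \ (apply + (map (lambda (c) (log (c x))) cs))))))
</scm-code>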
<subsection|Constrained optimization problem>
A general constrained optimization problem has the following form,
<\eqnarray>
<tformat|<table|<row|<cell|>|<cell|>|<cell|min
f<around*|(|x|)>,>>|<row|<cell|s.t.>|<cell|>|<cell|c<rsub|e><around*|(|x|)>=0,e\<in\><with|font|cal|E>\<equiv\><around*|{|1,2,\<cdots\>,m<rsub|e>|}>,>>|<row|<cell|>|<cell|>|<cell|c<rsub|i><around*|(|x|)>\<geqslant\>0,i\<in\><with|font|cal|I>\<equiv\><around*|{|m<rsub|e>+1,\<cdots\>,m-1,m|}>.>>>>
</eqnarray>
where <math|<with|font|cal|E>> is the index of constraint equations, and
<math|<with|font|cal|I>> is the index of constraint inequations.
Constrained optimization problem is much more complicate than unconstrained
optimization problem.
<\definition>
<with|font-series|bold|Feasible set>: the set of points that satisfy the
constraints, i.e.
<\equation*>
\<Omega\>\<assign\><around*|{|x<around*|\||c<rsub|i><around*|(|x|)>=0,i\<in\><with|font|cal|E>|\<nobracket\>>;c<rsub|i><around*|(|x|)>\<geqslant\>0,i\<in\><with|font|cal|I>|}>.
</equation*>
</definition>
<\definition>
<with|font-series|bold|Active set> <math|<with|font|cal|A><around*|(|x|)>>
at feasible point <math|x>: set of indices of equality constraints and
the indices of inequality constraints for which
<math|c<rsub|i><around*|(|x|)>=0>, i.e.,
<\equation*>
<with|font|cal|A><around*|(|x|)>\<assign\><with|font|cal|E>\<cup\><around*|{|i\<in\><with|font|cal|I><around*|\||c<rsub|i><around*|(|x|)>=0|\<nobracket\>>|}>.
</equation*>
An inequality constraint <math|i\<in\><with|font|cal|I>> is
<strong|active> if \ <math|c<rsub|i><around*|(|x|)>=0>, and
<strong|inactive> if <math|c<rsub|i><around*|(|x|)>\<gtr\>0>.
</definition>
<subsubsection|Penalty function method>
<\definition>
<with|font-series|bold|Penalty function>,
<\equation*>
P<around*|(|x,\<sigma\>|)>\<assign\>f<around*|(|x|)>+\<sigma\><around*|(|<big|sum><rsub|e=1><rsup|m<rsub|e>><around*|[|c<rsub|e><around*|(|x|)>|]><rsup|2>+<big|sum><rsub|i=m<rsub|e>+1><rsup|m><around*|[|c<rsub|i-><around*|(|x|)>|]><rsup|2>|)>,
</equation*>
where
<\equation*>
c<rsub|i-><around*|(|x|)>=min<around*|{|0,c<rsub|i><around*|(|x|)>|}>.
</equation*>
Introduce
<\equation*>
<wide|c|\<bar\>><around*|(|x|)>=<around*|[|<wide|c|\<bar\>><rsub|1><around*|(|x|)>\<nocomma\>,\<cdots\>,<wide|c|\<bar\>><rsub|m<rsub|e>><around*|(|x|)>,<wide|c|\<bar\>><rsub|m<rsub|e>+1><around*|(|x|)>,\<cdots\>,<wide|c|\<bar\>><rsub|m><around*|(|x|)>|]>,
</equation*>
and
<\equation*>
<wide|c|\<bar\>><rsub|i><around*|(|x|)>=<choice|<tformat|<table|<row|<cell|c<rsub|i><around*|(|x|)>,>|<cell|i\<in\><with|font|cal|E>,>>|<row|<cell|c<rsub|i-><around*|(|x|)>,>|<cell|i\<in\><with|font|cal|I>.>>>>>
</equation*>
Then the penalty function can be rewritten as
<\equation*>
P<around*|(|x,\<sigma\>|)>=f<around*|(|x|)>+\<sigma\><around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x|)>|\<\|\|\>><rsup|2>.
</equation*>
</definition>
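A minimal Scheme sketch of <math|<wide|c|\<bar\>>> and the penalty
function, assuming <verbatim|ce-list> holds the equality constraints and
<verbatim|ci-list> the inequality constraints (illustrative names):
<\scm-code>
;;penalty function P(x,sigma) (a sketch)
(define (c-bar ce-list ci-list)
\ \ (lambda (x)
\ \ \ \ (append (map (lambda (c) (c x)) ce-list)
\ \ \ \ \ \ \ \ \ \ \ \ (map (lambda (c) (min 0 (c x))) ci-list)))) ;;c_{i-}
(define (penalty-function f ce-list ci-list sigma)
\ \ (lambda (x)
\ \ \ \ (+ (f x)
\ \ \ \ \ \ \ (* sigma
\ \ \ \ \ \ \ \ \ \ (apply + (map (lambda (v) (* v v))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ ((c-bar ce-list ci-list) x)))))))
</scm-code>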
Let <math|x<around*|(|\<sigma\>|)>> be the solution of the problem
<\equation*>
min<rsub|x\<in\>\<bbb-R\><rsup|n>>f<around*|(|x|)>+\<sigma\><around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x|)>|\<\|\|\>><rsup|2>\<equiv\>min<rsub|x\<in\>\<bbb-R\><rsup|n>>
P<around*|(|x,\<sigma\>|)>,
</equation*>
the following lemmas give the basis of the penalty function method.
<\lemma>
<math|x<around*|(|\<sigma\>|)>> is also the solution of constrained
optimization problem,
<\equation*>
min<rsub|x\<in\>\<bbb-R\><rsup|n>>f<around*|(|x|)>,s.t.
<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x|)>|\<\|\|\>>\<leqslant\>\<delta\>,
</equation*>
where <math|\<delta\>=<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\>|)>|)>|\<\|\|\>>>.
note: This means that if we can let <math|\<delta\>\<rightarrow\>0>, we
get the solution of the original constrained optimization problem.
</lemma>
<\proof>
Suppose <math|x> satisfies <math|<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x|)>|\<\|\|\>>\<leqslant\>\<delta\>=<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\>|)>|)>|\<\|\|\>>>;
then by the definition of <math|x<around*|(|\<sigma\>|)>>,
<\equation*>
f<around*|(|x<around*|(|\<sigma\>|)>|)>+\<sigma\><around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\>|)>|)>|\<\|\|\>><rsup|2>\<leqslant\>f<around*|(|x|)>+\<sigma\><around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x|)>|\<\|\|\>><rsup|2>\<leqslant\>f<around*|(|x|)>+\<sigma\><around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\>|)>|)>|\<\|\|\>><rsup|2>,
</equation*>
which gives the result.
</proof>
<\lemma>
If <math|\<sigma\><rsub|2>\<gtr\>\<sigma\><rsub|1>\<geqslant\>0>, then
<\equation*>
f<around*|(|x<around*|(|\<sigma\><rsub|2>|)>|)>\<geqslant\>f<around*|(|x<around*|(|\<sigma\><rsub|1>|)>|)>,
</equation*>
<\equation*>
<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\><rsub|2>|)>|)>|\<\|\|\>>\<leqslant\><around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\><rsub|1>|)>|)>|\<\|\|\>>.
</equation*>
note: This means one can use an increasing seq <math|\<sigma\><rsub|k>>,
<\equation*>
\<sigma\><rsub|1>\<less\>\<sigma\><rsub|2>\<less\>\<cdots\>\<less\>\<sigma\><rsub|k>\<less\>\<cdots\>,
</equation*>
to give a seq of <math|\<delta\><rsub|k>=<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\><rsub|k>|)>|)>|\<\|\|\>>>
which decreases.
</lemma>
<\proof>
Since <math|x<around*|(|\<sigma\><rsub|1>|)>> is the minimum of
<math|P<around*|(|x,\<sigma\><rsub|1>|)>>,
<\equation*>
f<around*|(|x<around*|(|\<sigma\><rsub|1>|)>|)>+\<sigma\><rsub|1><around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\><rsub|1>|)>|)>|\<\|\|\>><rsup|2>\<leqslant\>f<around*|(|x<around*|(|\<sigma\><rsub|2>|)>|)>+\<sigma\><rsub|1><around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\><rsub|2>|)>|)>|\<\|\|\>><rsup|2>.
</equation*>
If the second inequality is true, then the above inequality implies the
first, so we only need to prove the second inequality.
Since <math|x<around*|(|\<sigma\><rsub|2>|)>> is the minimum of
<math|P<around*|(|x,\<sigma\><rsub|2>|)>>,
<\equation*>
f<around*|(|x<around*|(|\<sigma\><rsub|2>|)>|)>+\<sigma\><rsub|2><around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\><rsub|2>|)>|)>|\<\|\|\>><rsup|2>\<leqslant\>f<around*|(|x<around*|(|\<sigma\><rsub|1>|)>|)>+\<sigma\><rsub|2><around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\><rsub|1>|)>|)>|\<\|\|\>><rsup|2>.
</equation*>
Adding the above two inequalities,
<\equation*>
<around*|(|\<sigma\><rsub|1>-\<sigma\><rsub|2>|)><around*|[|<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\><rsub|1>|)>|)>|\<\|\|\>><rsup|2>-<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\><rsub|2>|)>|)>|\<\|\|\>><rsup|2>|]>\<leqslant\>0,
</equation*>
which gives the second inequality.
</proof>
<\theorem>
If <math|min<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x|)>|\<\|\|\>>=0>,
then a seq of <math|\<sigma\><rsub|k>> with
<math|\<sigma\><rsub|k>\<rightarrow\>\<infty\>> can give
<\equation*>
<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\><rsub|k>|)>|)>|\<\|\|\>>\<rightarrow\>0.
</equation*>
</theorem>
<\proof>
Suppose it is not true, then <math|\<exists\>\<varepsilon\>\<gtr\>0>,
s.t.,
<\equation*>
<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\><rsub|k>|)>|)>|\<\|\|\>>\<gtr\>\<varepsilon\>,\<forall\>k.
</equation*>
But since <math|min<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x|)>|\<\|\|\>>=0>,
there exists <math|<wide|x|^>>, s.t.,
<\equation*>
<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|<wide|x|^>|)>|\<\|\|\>>\<less\>\<varepsilon\>.
</equation*>
By the definition of <math|x<around*|(|\<sigma\>|)>>,
<\equation*>
f<around*|(|<wide|x|^>|)>+\<sigma\><rsub|k><around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|<wide|x|^>|)>|\<\|\|\>><rsup|2>\<geqslant\>f<around*|(|x<around*|(|\<sigma\><rsub|k>|)>|)>+\<sigma\><rsub|k><around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\><rsub|k>|)>|)>|\<\|\|\>><rsup|2>\<geqslant\>f<around*|(|x<around*|(|\<sigma\><rsub|1>|)>|)>+\<sigma\><rsub|k><around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\><rsub|k>|)>|)>|\<\|\|\>><rsup|2>,
</equation*>
which means
<\equation*>
<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|<wide|x|^>|)>|\<\|\|\>><rsup|2>\<geqslant\><around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x<around*|(|\<sigma\><rsub|k>|)>|)>|\<\|\|\>><rsup|2>+<frac|f<around*|(|x<around*|(|\<sigma\><rsub|1>|)>|)>-f<around*|(|<wide|x|^>|)>|\<sigma\><rsub|k>>\<gtr\>\<varepsilon\><rsup|2>,as
\<sigma\><rsub|k>\<rightarrow\>+\<infty\>.
</equation*>
This contradicts <math|<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|<wide|x|^>|)>|\<\|\|\>>\<less\>\<varepsilon\>>,
which proves the theorem.
</proof>
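Putting the lemmas together, a hedged sketch of a penalty-method driver;
<verbatim|minimize> is an assumed unconstrained minimizer (e.g. one of
the methods of the previous subsections), <verbatim|P-of> builds
<math|P<around*|(|\<cdot\>,\<sigma\>|)>> and <verbatim|cbar-norm>
computes <math|<around*|\<\|\|\>|<wide|c|\<bar\>><around*|(|x|)>|\<\|\|\>>>:
<\scm-code>
;;penalty method driver (a sketch; minimize is assumed)
(define (penalty-method P-of cbar-norm x0 sigma0 tol)
\ \ (let loop ([x x0] [sigma sigma0])
\ \ \ \ (let ([x-new (minimize (P-of sigma) x)]) ;;unconstrained step
\ \ \ \ \ \ (if (\<less\> (cbar-norm x-new) tol)
\ \ \ \ \ \ \ \ \ \ x-new
\ \ \ \ \ \ \ \ \ \ (loop x-new (* 10 sigma)))))) ;;increasing sigma_k
</scm-code>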
<subsubsection|Augmented Lagrange penalty function method>
Although the penalty function method is mathematically reasonable, it
needs <math|\<sigma\><rsub|k>\<rightarrow\>\<infty\>> to reach the
solution, which is not practical. A method to address this problem is to
introduce the Lagrange multipliers <math|\<lambda\><rsub|i>>.
To introduce the idea, let's consider equality-constrained problem,
<\equation*>
min f<around*|(|x|)> s.t. c<rsub|i><around*|(|x|)>=0,i=1,\<cdots\>,m.
</equation*>
As we know, the solution <math|x<rsup|\<ast\>>> of the problem is also a
stationary point of the Lagrange function
<\equation*>
L<around*|(|x,\<lambda\>|)>=f<around*|(|x|)>-\<lambda\><rsup|T>c<around*|(|x|)>,
</equation*>
but generally, <math|x<rsup|\<ast\>>>, with the multiplier
<math|\<lambda\><rsup|\<ast\>>>, is not the local minimum of
<math|L<around*|(|x,\<lambda\>|)>>.\
We can introduce, as in the above subsubsection, a penalty function,
<\equation*>
P<around*|(|x,\<lambda\><rsup|\<ast\>>,\<sigma\>|)>=L<around*|(|x,\<lambda\><rsup|\<ast\>>|)>+<frac|1|2>\<sigma\><around*|\<\|\|\>|c<around*|(|x|)>|\<\|\|\>><rsup|2>,
</equation*>
one can see that if <math|\<sigma\>> is large enough (but not
<math|+\<infty\>>), the penalty function can give a seq <math|x<rsub|k>>
which approaches <math|x<rsup|\<ast\>>>.
<\definition>
<with|font-series|bold|Penalty function with Lagrange's multiplier>
<\eqnarray>
<tformat|<table|<row|<cell|P<around*|(|x,\<lambda\>,\<sigma\>|)>>|<cell|\<assign\>>|<cell|f<around*|(|x|)>>>|<row|<cell|>|<cell|->|<cell|<big|sum><rsub|e=1><rsup|m<rsub|e>><around*|[|\<lambda\><rsup|<around*|(|e|)>>c<rsub|e><around*|(|x|)>-<frac|1|2>\<sigma\><rsup|<around*|(|e|)>>c<rsub|e><around*|(|x|)><rsup|2>|]>>>|<row|<cell|>|<cell|->|<cell|<big|sum><rsub|i=m<rsub|e>+1><rsup|m><choice|<tformat|<table|<row|<cell|\<lambda\><rsup|<around*|(|i|)>>c<rsub|i><around*|(|x|)>-<frac|1|2>\<sigma\><rsup|<around*|(|i|)>>c<rsub|i><around*|(|x|)><rsup|2>,>|<cell|if
c<rsup|<around*|(|i|)>>\<less\>\<lambda\><rsup|<around*|(|i|)>>/\<sigma\><rsup|<around*|(|i|)>>,>>|<row|<cell|<frac|1|2>\<lambda\><rsup|<around*|(|i|)>2>/\<sigma\><rsup|<around*|(|i|)>>,>|<cell|others.>>>>>>>>>
</eqnarray>
</definition>
xxxxxx
\;
<section|Algebraic equations>
<strong|Problem>: solving <math|\<b-x\>> in the eqn(s):
<\equation*>
\<b-f\><around*|(|\<b-x\>|)>=\<b-0\>.
</equation*>
<subsection|Single variable eqn>
The algebraic eqn problem for a single variable eqn is to solve <math|x>
in the eqn:
<\equation*>
f<around*|(|x|)>=0.
</equation*>
<strong|Iterative method>: Try to find an iterative sequence
<math|<around*|{|x<rsub|0>,x<rsub|1>,\<cdots\>,x<rsub|n>,x<rsub|n+1>,\<cdots\>|}>>
whose limit is just the solution: <math|x=lim<rsub|n\<rightarrow\>\<infty\>>x<rsub|n>>.
Then <math|x<rsub|n>> can approximate the solution when <math|n> is large
enough.
The problem is how to find the iterative relation.
<subsubsection|Bisection method>
Assume <math|f<around*|(|a|)>f<around*|(|b|)>\<leqslant\>0>; then by the
continuity of <math|f<around*|(|x|)>>, there exists a solution <math|x>
in <math|<around*|(|a,b|)>>.
Choose the midpoint <math|x<rsub|mid>=<frac|a+b|2>>, and we can get a
smaller interval <math|<around*|(|a<rprime|'>,b<rprime|'>|)>> which is
<math|<around*|(|a,x<rsub|mid>|)>> or <math|<around*|(|x<rsub|mid>,b|)>>
still having <math|f<around*|(|a<rprime|'>|)>f<around*|(|b<rprime|'>|)>\<leqslant\>0>.
This gives an iteration, called the <strong|Bisection method>:
<\eqnarray>
<tformat|<table|<row|<cell|i<rsub|0>>|<cell|=>|<cell|<around*|(|a,b|)>,>>|<row|<cell|i<rsub|n+1>>|<cell|=>|<cell|<choice|<tformat|<table|<row|<cell|<around*|(|left
<around*|(|i<rsub|n>|)>,mid<around*|(|i<rsub|n>|)>|)>,>|<cell|if
f<around*|(|left <around*|(|i<rsub|n>|)>|)>f<around*|(|mid<around*|(|i<rsub|n>|)>|)>\<leqslant\>0>>|<row|<cell|<around*|(|mid<around*|(|i<rsub|n>|)>,right<around*|(|i<rsub|n>|)>|)>,>|<cell|else>>>>>.>>>>
</eqnarray>
and
<\equation*>
x<rsub|n>=mid<around*|(|i<rsub|n>|)>.
</equation*>
<\scm-code>
;;\<less\>Scheme code\<gtr\> ------ Bisection method
;;; Problem: solve x in f(x)=0, in (a b) with f(a)*f(b)\<less\>0.
;;1. next-interval i_n =\<gtr\> i_{n+1}
(define (next-interval interval f);an interval is a pair (cons x0 x1)
\ \ (let* ([x0 (car interval)]
\ \ \ \ \ \ \ \ \ [x1 (cdr interval)]
\ \ \ \ \ \ \ \ \ [x2 (/ (+ x0 x1) 2.0)])
\ \ \ \ (if (\<less\>= (* (f x0) (f x2)) 0)
\ \ \ \ \ \ \ \ (cons x0 x2)
\ \ \ \ \ \ \ \ (cons x2 x1))))
;;2. interval stream (i0 i1 i2..)
(define (Bisection-stream f init-interval);i0 = init-interval
\ \ (stream-cons init-interval
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (Bisection-stream f
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (next-interval
init-interval f))))
;;3. solution-stream (x0 x1 x2 ..)
(define (Bisection-solutions f a b)
\ \ (stream-map (lambda (interval)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (/ (+ (car interval) (cdr interval))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 2.0))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ (Bisection-stream f (cons a b))))
</scm-code>
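For example, approximating the root of <math|x<rsup|2>-2> in
<math|<around*|(|1,2|)>>, i.e. <math|<sqrt|2>>:
<\scm-code>
;;usage: the 20th bisection approximation of sqrt(2)
(stream-ref (Bisection-solutions (lambda (x) (- (* x x) 2.0)) 1.0 2.0)
\ \ \ \ \ \ \ \ \ \ \ \ 20)
;;=\<gtr\> ~1.4142135 (the error is below (b-a)/2^21)
</scm-code>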
<subsubsection|Iteration-function method>
The iterative sequence may be defined by an iteration function <math|g> as
<\equation*>
x<rsub|n+1>=g<around*|(|x<rsub|n>,x<rsub|n-1>,\<cdots\>,x<rsub|n-K>|)>,K\<geqslant\>0.
</equation*>
The simplest case is <math|K=0> called <strong|Fixed-point method>, i.e.
<\equation*>
x<rsub|n+1>=g<around*|(|x<rsub|n>|)>.
</equation*>
Obviously, the solution <math|x> is just the solution of
<math|x=g<around*|(|x|)>>, which means <math|x> is a fixed point of
<math|g>.
But how do we find such an iteration function <math|g> from <math|f>?
<strong|Newton method>: Newton gives one which is quite fast,
<\equation*>
g<around*|(|x|)>\<assign\>x-<frac|f<around*|(|x|)>|f<rprime|'><around*|(|x|)>>.
</equation*>
[*Idea*: Suppose <math|x<rsub|\<ast\>>> is the solution, and
<math|x> is near <math|x<rsub|\<ast\>>>, then the approximation
<math|0=f<around*|(|x<rsub|\<ast\>>|)>\<simeq\>f<around*|(|x|)>+f<rprime|'><around*|(|x|)><around*|(|x<rsub|\<ast\>>-x|)>>
gives <math|x<rsub|\<ast\>>\<simeq\>g<around*|(|x|)>>.]
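A minimal stream sketch of the Newton iteration, with the derivative
<verbatim|fp> supplied by hand (the stream operations are the ones used
throughout this document):
<\scm-code>
;;Newton method (a sketch; fp = derivative of f, supplied by hand)
(define (Newton-stream f fp x0)
\ \ (stream-cons x0
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (Newton-stream f fp (- x0 (/ (f x0) (fp x0))))))
;;e.g. (stream-ref (Newton-stream (lambda (x) (- (* x x) 2.0))
;;\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (lambda (x) (* 2 x)) 1.0) 5) ;;=\<gtr\> ~1.4142135
</scm-code>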
The defect of the Newton method is that the derivative of <math|f> at
<math|x> is hard to get.
To avoid this, one can replace <math|f<rprime|'><around*|(|x<rsub|n>|)>\<rightarrow\><frac|f<around*|(|x<rsub|n>|)>-f<around*|(|x<rsub|n-1>|)>|x<rsub|n>-x<rsub|n-1>>>,
which gives an iterative function with <math|K=1>, called the
<strong|Newton-Secant method>: [secant means cut]
<\equation*>
x<rsub|n+1>=x<rsub|n>-<around*|(|x<rsub|n>-x<rsub|n-1>|)><frac|f<around*|(|x<rsub|n>|)>|f<around*|(|x<rsub|n>|)>-f<around*|(|x<rsub|n-1>|)>>.
</equation*>
<\scm-code>
;;\<less\>code\<gtr\>------- Newton-Secant method
;;next-solution
(define (next-point x0 x1 f);(x0,x1) -\<gtr\> x2
\ \ (- x1 (* (- x1 x0) (f x1)
\ \ \ \ \ \ \ \ \ \ \ (/ 1.0 (- (f x1) (f x0))))))
;;solution stream
(define (Newton-Secant-stream f x0 x1)
\ \ (stream-cons x0
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (Newton-Secant-stream f x1 (next-point x0
x1 f))))
</scm-code>
<subsubsection|Mixed method>
The Newton-Secant method is fast, but it may happen that the iteration
does not converge; to guard against this, one can mix the Bisection
method (safe) with the Newton-Secant method (unsafe).
The Bisection method is safe since <math|x\<in\>\<cdots\>\<subset\>i<rsub|n+1>\<subset\>i<rsub|n>\<subset\>\<cdots\>\<subset\>i<rsub|0>>;
we can modify the <math|i<rsub|n>>s so that their boundaries come from
the Newton-Secant method:
<\eqnarray>
<tformat|<table|<row|<cell|i<rsub|0>>|<cell|=>|<cell|<around*|(|a,b|)>,>>|<row|<cell|i<rsub|n+1>>|<cell|=>|<cell|<choice|<tformat|<table|<row|<cell|<around*|(|left<around*|(|i<rsub|n>|)>,x<rsub|mix>|)>,>|<cell|if
f<around*|(|x<rsub|mix>|)>f<around*|(|left<around*|(|i<rsub|n>|)>|)>\<leqslant\>0>>|<row|<cell|<around*|(|x<rsub|mix>,right<around*|(|i<rsub|n>|)>|)>,>|<cell|else>>>>>,where>>|<row|<cell|x<rsub|mix>>|<cell|=>|<cell|<choice|<tformat|<table|<row|<cell|NewtonSecant<around*|(|f,left<around*|(|i<rsub|n>|)>,right<around*|(|i<rsub|n>|)>|)>,>|<cell|<text|if it lies in >i<rsub|n>>>|<row|<cell|mid<around*|(|i<rsub|n>|)>,>|<cell|else>>>>>.>>>>
</eqnarray>
<strong|Note:> Mixed-method is fast & safe.
<\scm-code>
;;\<less\>code\<gtr\>----- Mixed method
;;i_n =\<gtr\> i_{n+1}
(define (next-interval/mix interval f)
\ \ (let* ([x0 (car interval)]
\ \ \ \ \ \ \ \ \ [x1 (cdr interval)]
\ \ \ \ \ \ \ \ \ [x2 (let ([xns (next-point x0 x1 f)] ;;from Newton-Secant code
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ [xmid (/ (+ x0 x1) 2.0)])
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (if (and (\<gtr\> xns x0) (\<less\> xns
x1))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ xns
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ xmid))])
\ \ \ \ (if (\<less\>= (* (f x0) (f x2)) 0)
\ \ \ \ \ \ \ \ (cons x0 x2)
\ \ \ \ \ \ \ \ (cons x2 x1))))
;;interval stream (i0 i1 i2..)
(define (Mixed-stream f init-interval);i0 = init-interval
\ \ (stream-cons init-interval
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (Mixed-stream f
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (next-interval/mix
init-interval f))))
;;solution stream
(define (Mixed-solutions f a b)
\ \ (stream-map (lambda (interval) (if (\<less\> (abs (f (car interval)))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (abs
(f (cdr interval))))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (car
interval)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (cdr
interval)))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ (Mixed-stream f (cons a b))))
</scm-code>
<subsubsection|Accelerating iteration: Aitken method and its super
version>
Suppose we have a convergent seq <math|<around*|(|x<rsub|0>,x<rsub|1>,x<rsub|2>,\<cdots\>,x<rsub|n>,\<cdots\>|)>>,
and the convergence is of <em|linear order>; then we can accelerate this
convergent seq to get a new convergent seq
<math|<around*|(|<wide|x|~><rsub|0>,<wide|x|~><rsub|1>,<wide|x|~><rsub|2>,\<cdots\>,<wide|x|~><rsub|n>,\<cdots\>|)>>:
<\equation*>
<wide|x|~><rsub|n>=x<rsub|n>-<frac|<around*|(|x<rsub|n+1>-x<rsub|n>|)><rsup|2>|x<rsub|n+2>-2x<rsub|n+1>+x<rsub|n>>.
</equation*>
This accelerating method is called <strong|Aitken method>.
[<em|*Idea*>: Let <math|x=lim<rsub|n\<rightarrow\>\<infty\>>x<rsub|n>>.
Since the convergence is supposed to be of linear order, i.e.
<\equation*>
lim<rsub|n\<rightarrow\>\<infty\>><frac|x<rsub|n+1>-x|x<rsub|n>-x>=\<lambda\>=lim<rsub|n\<rightarrow\>\<infty\>><frac|x<rsub|n+2>-x|x<rsub|n+1>-x>,
</equation*>
by solving <math|<wide|x|~>> in
<\equation*>
<frac|x<rsub|n+1>-<wide|x|~>|x<rsub|n>-<wide|x|~>>=<frac|x<rsub|n+2>-<wide|x|~>|x<rsub|n+1>-<wide|x|~>>,
</equation*>
one can get the <math|<wide|x|~>=<wide|x|~><rsub|n>> given above.
Note that the Aitken method is like a Newton method that uses no
function, only the sequence itself, which makes it quite useful.]
Furthermore, one can accelerate the accelerated seq again and again,
<text-dots>, which gives a <strong|Super acceleration method>:
<\equation*>
<tabular*|<tformat|<table|<row|<cell|seq<rsup|<around*|(|0|)>>:>|<cell|>|<cell|x<rsub|0><rsup|<around*|(|0|)>>>|<cell|x<rsub|1><rsup|<around*|(|0|)>>>|<cell|x<rsub|2><rsup|<around*|(|0|)>>>|<cell|x<rsub|3><rsup|<around*|(|0|)>>>|<cell|x<rsub|4><rsup|<around*|(|0|)>>>|<cell|x<rsub|5><rsup|<around*|(|0|)>>>|<cell|x<rsub|6><rsup|<around*|(|0|)>>>|<cell|x<rsub|7><rsup|<around*|(|0|)>>>|<cell|x<rsub|8><rsup|<around*|(|0|)>>>|<cell|\<cdots\>>>|<row|<cell|seq<rsup|<around*|(|1|)>>:>|<cell|>|<cell|>|<cell|x<rsub|0><rsup|<around*|(|1|)>>>|<cell|x<rsub|1><rsup|<around*|(|1|)>>>|<cell|x<rsub|2><rsup|<around*|(|1|)>>>|<cell|x<rsub|3><rsup|<around*|(|1|)>>>|<cell|x<rsub|4><rsup|<around*|(|1|)>>>|<cell|x<rsub|5><rsup|<around*|(|1|)>>>|<cell|x<rsub|6><rsup|<around*|(|1|)>>>|<cell|x<rsub|7><rsup|<around*|(|1|)>>>|<cell|\<cdots\>>>|<row|<cell|seq<rsup|<around*|(|2|)>>:>|<cell|>|<cell|>|<cell|>|<cell|x<rsub|0><rsup|<around*|(|2|)>>>|<cell|x<rsub|1><rsup|<around*|(|2|)>>>|<cell|x<rsub|2><rsup|<around*|(|2|)>>>|<cell|x<rsub|3><rsup|<around*|(|2|)>>>|<cell|x<rsub|4><rsup|<around*|(|2|)>>>|<cell|x<rsub|5><rsup|<around*|(|2|)>>>|<cell|x<rsub|6><rsup|<around*|(|2|)>>>|<cell|\<cdots\>>>|<row|<cell|\<vdots\>>|<cell|>|<cell|>|<cell|>|<cell|>|<cell|\<ddots\>>|<cell|\<vdots\>>|<cell|\<vdots\>>|<cell|\<vdots\>>|<cell|\<vdots\>>|<cell|\<vdots\>>|<cell|>>>>>
</equation*>
\;
<\scm-code>
;;\<less\>code\<gtr\>-- super-mixed method
;; stream of streams
(define (stream-of-streams trans s)
\ \ (stream-cons s (stream-of-streams trans (trans s))))
;; super accelerator
(define (Super-accelerator-stream accelerator s)
\ \ (stream-map stream-car (stream-of-streams accelerator s)))
;;^^^^^^^^^ above is general and can be used in other places
;; Aitken accelerator
(define (Aitken-accelerator strm)
\ \ (let* ([x0 (stream-ref strm 0)]
\ \ \ \ \ \ \ \ \ [x1 (stream-ref strm 1)]
\ \ \ \ \ \ \ \ \ [x2 (stream-ref strm 2)]
\ \ \ \ \ \ \ \ \ [denomi (+ x0 x2 (* -2 x1))])
\ \ \ \ (stream-cons (if (= denomi 0.0)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ x2
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (- x2 (/ (expt (- x2 x1) 2)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ denomi)))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (Aitken-accelerator (stream-cdr
strm)))))
\;
;; The final solutions
(define (Super-Mixed-stream f a b)
\ \ (Super-accelerator-stream Aitken-accelerator (Mixed-solutions f a
b)))
</scm-code>
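A small usage sketch, again on <math|f<around*|(|x|)>=x<rsup|2>-2> in
<math|<around*|(|1,2|)>>:
<\scm-code>
;;usage: super-accelerated approximations of sqrt(2)
(stream-ref (Super-Mixed-stream (lambda (x) (- (* x x) 2.0)) 1.0 2.0) 3)
;;=\<gtr\> ~1.4142135, with far fewer terms than plain bisection
</scm-code>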
<\note*>
(1) The iteration-function method is a general method for solving many
problems;
(2) The Aitken acceleration method can also be used on many convergent
sequences.
</note*>
<subsection|Linear eqns>
<strong|Problem:> Solve <math|<wide|x|\<vect\>>> in the linear eqns
<\equation*>
A<wide|x|\<vect\>>=<wide|b|\<vect\>>,
</equation*>
where <math|det<around*|(|A|)>\<neq\>0>.
<subsubsection|Direct methods>
Direct methods can solve systems of not too high order. One direct
method is the famous <strong|Gauss elimination method>; another is the
<strong|least squares (LS) method>, which uses the Cholesky
decomposition. Here we consider the latter.
Since <math|A<rsup|-1>> exists, <math|A<rsup|T>A\<equiv\>C> is symmetric
and positive definite. Then by the <strong|Cholesky decomposition>, there
exists a unique non-singular lower triangular matrix <math|L>, s.t.,
<math|A<rsup|T>A=L L<rsup|T>>, whose explicit expression is
<\eqnarray>
<tformat|<table|<row|<cell|l<rsub|i i>>|<cell|=>|<cell|<sqrt|c<rsub|i
i>-<big|sum><rsub|k=1><rsup|i-1>l<rsub|i
k><rsup|2>>,>>|<row|<cell|l<rsub|i j>>|<cell|=>|<cell|<around*|(|c<rsub|i
j>-<big|sum><rsub|k=1><rsup|j-1>l<rsub|i k>l<rsub|j k>|)>/l<rsub|j
j>,where i\<gtr\>j.>>>>
</eqnarray>
\;
Thus one can solve <math|<wide|x|\<vect\>>> in two steps:\
<\eqnarray>
<tformat|<table|<row|<cell|L<wide|y|\<vect\>>>|<cell|=>|<cell|A<rsup|T>
<wide|b|\<vect\>>,>>|<row|<cell|L<rsup|T><wide|x|\<vect\>>>|<cell|=>|<cell|<wide|y|\<vect\>>.>>>>
</eqnarray>
Note that there is an improved version using the <math|L D
L<rsup|T>>-decomposition, which avoids the sqrt operation.
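A Scheme sketch of the Cholesky decomposition following the formulas
above; <verbatim|matrix-ref> is the accessor used elsewhere in this
section, and <math|L> is returned as a vector of row vectors:
<\scm-code>
;;Cholesky decomposition C = L L^T (a sketch)
(define (cholesky C n)
\ \ (define L (make-vector n))
\ \ (define (Lref i j) (vector-ref (vector-ref L i) j))
\ \ (define (Lset! i j v) (vector-set! (vector-ref L i) j v))
\ \ (define (sum-lk i j) ;;sum over k \<less\> j of l_ik * l_jk
\ \ \ \ (do ((k 0 (+ k 1))
\ \ \ \ \ \ \ \ \ (s 0 (+ s (* (Lref i k) (Lref j k)))))
\ \ \ \ \ \ \ \ ((= k j) s)))
\ \ (do ((i 0 (+ i 1))) ((= i n)) (vector-set! L i (make-vector n 0)))
\ \ (do ((j 0 (+ j 1))) ((= j n) L)
\ \ \ \ (Lset! j j (sqrt (- (matrix-ref C j j) (sum-lk j j)))) ;;l_jj
\ \ \ \ (do ((i (+ j 1) (+ i 1))) ((= i n)) ;;l_ij for i \<gtr\> j
\ \ \ \ \ \ (Lset! i j (/ (- (matrix-ref C i j) (sum-lk i j))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (Lref j j))))))
</scm-code>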
<subsubsection|Iterative methods>
Here we consider iterative methods, i.e., finding a matrix <math|G> such
that the original problem above is equivalent to
<\equation>
<wide|x|\<vect\>>=G<wide|x|\<vect\>>+<wide|h|\<vect\>><label|iteration-eqns-linear>,
</equation>
with <math|det<around*|(|I-G|)>\<neq\>0>.
The advantages of iterative methods are: (i) they can solve systems of
large order; (ii) they can be run in parallel; (iii) they can avoid the
error accumulation of the Gauss elimination method.
But when would an iteration method(<reference|iteration-eqns-linear>) fail?
<\theorem>
The iterative seq given by the iteration relation
(<reference|iteration-eqns-linear>) is convergent if and only if the
spectral radius of <math|G> is smaller than 1,
<\equation*>
\<rho\><around*|(|G|)>\<less\>1,
</equation*>
where <math|\<rho\><around*|(|G|)>=max<around*|{|<around*|\||\<lambda\><rsub|i>|\|>|}>>,
<math|\<lambda\><rsub|i>>s are the eigenvalues of <math|G>.
<\proof>
Define the difference <math|<wide|e|\<vect\>><rsub|n+1>=<wide|x|\<vect\>><rsub|n+1>-<wide|x|\<vect\>><rsub|n>>,
then by <math|<wide|x|\<vect\>><rsub|n+1>=G<wide|x|\<vect\>><rsub|n>+<wide|h|\<vect\>>>
and <math|<wide|x|\<vect\>><rsub|n>=G<wide|x|\<vect\>><rsub|n-1>+<wide|h|\<vect\>>>,
one gets the relations of difference
<math|<wide|e|\<vect\>><rsub|n+1>=G<wide|e|\<vect\>><rsub|n>>, which
means
<\equation*>
<wide|e|\<vect\>><rsub|n>=G<rsup|n><wide|e|\<vect\>><rsub|0>.
</equation*>
By the Cauchy convergence thm, <math|<around*|{|<wide|x|\<vect\>><rsub|n>|}>>
converges iff <math|<wide|e|\<vect\>><rsub|n>\<rightarrow\>0>, which,
since <math|<wide|e|\<vect\>><rsub|0>> is arbitrary, is equivalent to
<math|G<rsup|n>\<rightarrow\>O>.
Since any matrix can be decomposed into Jordan blocks, i.e.
<\equation*>
G=S <around*|[|J<rsub|a>|]> S<rsup|-1>,where
J<rsub|a>=\<lambda\><rsub|a>E<rsub|a>+\<Delta\><rsub|a>,\<Delta\><rsub|a>=<around*|(|<tabular*|<tformat|<table|<row|<cell|0>|<cell|1>|<cell|>|<cell|>|<cell|>>|<row|<cell|>|<cell|0>|<cell|1>|<cell|O>|<cell|>>|<row|<cell|>|<cell|>|<cell|0>|<cell|\<ddots\>>|<cell|>>|<row|<cell|>|<cell|O>|<cell|>|<cell|0>|<cell|1>>|<row|<cell|>|<cell|>|<cell|>|<cell|>|<cell|0>>>>>|)><rsub|a\<times\>a>,
</equation*>
and <math|\<Delta\><rsub|a><rsup|a>=O>, we have for <math|n\<gtr\>a>,
<math|S<rsup|-1>G<rsup|n>S=<around*|[|J<rsub|a><rsup|n>|]>>, where
<\equation*>
J<rsub|a><rsup|n>=\<lambda\><rsub|a><rsup|n>E<rsub|a>+C<rsub|n><rsup|1>\<lambda\><rsup|n-1><rsub|a>\<Delta\><rsub|a>+\<cdots\>+C<rsub|n><rsup|a-1>\<lambda\><rsub|a><rsup|n-a+1>\<Delta\><rsub|a><rsup|a-1>.
</equation*>
If all <math|<around*|\||\<lambda\><rsub|a>|\|>\<less\>1>, each term
<math|C<rsub|n><rsup|j>\<lambda\><rsub|a><rsup|n-j>\<rightarrow\>0,as
n\<rightarrow\>\<infty\>>, which means <math|G<rsup|n>\<rightarrow\>O>;
On the other hand, if some eigenvalue <math|\<lambda\><rsub|a>> has
<math|<around*|\||\<lambda\><rsub|a>|\|>\<geqslant\>1>, then
<math|C<rsub|n><rsup|j>\<lambda\><rsup|n-j><rsub|a>\<rightarrow\>\<infty\>,j\<geqslant\>1,as
n\<rightarrow\>\<infty\>>, i.e. <math|<around*|(|J<rsub|a>|)><rsub|i,j\<gtr\>i>\<rightarrow\>\<infty\>>,
so <math|G<rsup|n>> does not tend to <math|O>.
\;
</proof>
</theorem>
<\note>
Could the Aitken method be used here? The answer is yes!
Consider <math|J<rsub|a>=\<lambda\>E+\<Delta\>> with
<math|<around*|\||\<lambda\>|\|>\<less\>1>: each term, such as the
<math|<around*|(|i,i+j|)>> element of <math|J<rsub|a><rsup|n>>, converges
with linear order: <math|<frac|C<rsub|n+1><rsup|j>\<lambda\><rsup|n+1-j>-0|C<rsub|n><rsup|j>\<lambda\><rsup|n-j>-0>\<rightarrow\>\<lambda\>>.
</note>
The key problem is to find such appropriate <math|G> and
<math|<wide|h|\<vect\>>>.
The following parts are all about iterative methods.
<subsubsection|Jacobi method>
Assume <math|a<rsub|i i>\<neq\>0,\<forall\>i>,[<em|don't worry about this>]
and let <math|D=diag<around*|(|a<rsub|11>,\<cdots\>,a<rsub|n n>|)>>, and
split <math|A> into <math|A=D+Q>, then:
<\equation>
<wide|x|\<vect\>>=-D<rsup|-1>Q<wide|x|\<vect\>>+D<rsup|-1><wide|b|\<vect\>><label|Jacobi-iteration-func>,
</equation>
thus we get an iteration method called <strong|Jacobi method> with
<\eqnarray>
<tformat|<table|<row|<cell|G>|<cell|=>|<cell|-D<rsup|-1>Q=<around*|(|<tabular*|<tformat|<table|<row|<cell|0>|<cell|-<frac|a<rsub|12>|a<rsub|11>>>|<cell|\<cdots\>>|<cell|-<frac|a<rsub|1n>|a<rsub|11>>>>|<row|<cell|-<frac|a<rsub|21>|a<rsub|22>>>|<cell|0>|<cell|\<cdots\>>|<cell|-<frac|a<rsub|2n>|a<rsub|22>>>>|<row|<cell|\<vdots\>>|<cell|\<vdots\>>|<cell|\<ddots\>>|<cell|\<nosymbol\>\<vdots\>>>|<row|<cell|-<frac|a<rsub|n1>|a<rsub|n
n>>>|<cell|-<frac|a<rsub|n2>|a<rsub|n
n>>>|<cell|>|<cell|0>>>>>|)>,>>|<row|<cell|<wide|h|\<vect\>>>|<cell|=>|<cell|D<rsup|-1><wide|b|\<vect\>>=<around*|(|<frac|b<rsub|1>|a<rsub|11>>,\<cdots\>,<frac|b<rsub|n>|a<rsub|n
n>>|)>.>>>>
</eqnarray>
A more explicit expression (also faster and easier) is the component
form:
<\equation*>
x<rsub|i><rsup|<around*|(|n+1|)>>=<frac|1|a<rsub|i
i>><around*|(|b<rsub|i>-<big|sum><rsub|j\<neq\>i>a<rsub|i
j>x<rsub|j><rsup|<around*|(|n|)>>|)>.
</equation*>
<\scm-code>
;;\<less\>code\<gtr\>-- Jacobi method
;; next-solution
(define (nextstate-Jacobi A b x n);;n=dim(x)
\ \ (vector-map
\ \ \ (lambda (i)
\ \ \ \ \ (/ (- (vector-ref b i)
\ \ \ \ \ \ \ \ \ \ \ (apply + (map (lambda (j)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (if (= j i)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 0
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (*
(matrix-ref A i j)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (vector-ref
x j))))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (range n))))
\ \ \ \ \ \ \ \ (matrix-ref A i i)))
\ \ \ (vector-range n)))
</scm-code>
<subsubsection|Gauss-Seidel method>
Assume <math|a<rsub|i i>\<neq\>0,\<forall\>i>, and let
<math|D=diag<around*|(|a<rsub|11>,\<cdots\>,a<rsub|n n>|)>>, then split
<math|A> into three parts: diagonal part, lower-triangular <math|L> and
upper-triangular <math|U>, <math|A=D+L+U>, then
<math|<around*|(|D+L|)><wide|x|\<vect\>>=-U<wide|x|\<vect\>>+<wide|b|\<vect\>>>,
which gives
<\equation>
<wide|x|\<vect\>>=-<around*|(|D+L|)><rsup|-1>U<wide|x|\<vect\>>+<around*|(|D+L|)><rsup|-1><wide|b|\<vect\>><label|Gauss-Siedel-iteration-func>,
</equation>
so that
<\eqnarray>
<tformat|<table|<row|<cell|G>|<cell|=>|<cell|-<around*|(|D+L|)><rsup|-1>U,>>|<row|<cell|<wide|h|\<vect\>>>|<cell|=>|<cell|<around*|(|D+L|)><rsup|-1><wide|b|\<vect\>>.>>>>
</eqnarray>
These eqns are not as hard as they look: by the iteration
<\equation*>
<around*|(|D+L|)><wide|x|\<vect\>><rsup|<around*|(|n+1|)>>=-U<wide|x|\<vect\>><rsup|<around*|(|n|)>>+<wide|b|\<vect\>>\<Longrightarrow\>D<wide|x|\<vect\>><rsup|<around*|(|n+1|)>>=-L<wide|x|\<vect\>><rsup|<around*|(|n+1|)>>-U<wide|x|\<vect\>><rsup|<around*|(|n|)>>+<wide|b|\<vect\>>,
</equation*>
since <math|L> is lower-triangular, when solving
<math|x<rsup|<around*|(|n+1|)>><rsub|j>>, the rhs does not involve
<math|x<rsup|<around*|(|n+1|)>><rsub|l\<geqslant\>j>>.
The explicit expression is
<\equation*>
x<rsub|i><rsup|<around*|(|n+1|)>>=<frac|1|a<rsub|i
i>><around*|(|b<rsub|i>-<big|sum><rsub|j\<less\>i>a<rsub|i
j>x<rsub|j><rsup|<around*|(|n+1|)>>-<big|sum><rsub|j\<gtr\>i>a<rsub|i
j>x<rsub|j><rsup|<around*|(|n|)>>|)>.
</equation*>
<\note>
(i) The Gauss-Seidel method is a modification of the Jacobi method that
<em|updates those <math|x<rsub|j>>s that have already been computed>;
(ii) Although simple, the Gauss-Seidel method can greatly improve on the
Jacobi method.
(iii) The elements <math|x<rsub|i>> are updated in the order
<math|x<rsub|1>,x<rsub|2>,\<cdots\>>, but this order seems not necessary;
so which order is the best? Is updating the one with larger
<math|<around*|\||<frac|a<rsub|i j>|a<rsub|i i>>|\|>> first better?
</note>
<\scm-code>
;;\<less\>code\<gtr\>---GaussSiedel method
;;next-solution
(define (nextstate-GaussSiedel A b x n)
\ \ (do ((newx (vector-copy x))
\ \ \ \ \ \ \ (i 0 (+ i 1)))
\ \ \ \ \ \ ((= i n) newx);;newx
\ \ \ \ (vector-set! newx i
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (/ (- (vector-ref b i)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (do ((sum 0)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (j 0 (+ j 1)))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ ((= j n) sum);;sum
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (if (not (= j i))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (set! sum (+
sum
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (*
(matrix-ref A i j)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (vector-ref
newx j);not x
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ ))))))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (matrix-ref A i i)))))
</scm-code>
<subsubsection|Successive-Over-Relaxation(SOR) method>
I found the Aitken method to be much faster than the SOR method, so this
method is omitted.
<subsubsection|Aitken method and its super version>
It has been shown that the Aitken method can be used in the iterative
methods for solving linear eqns, so in this part we give the code.
<\scm-code>
;;\<less\>code\<gtr\>--- super-Jacobi/GaussSiedel method
;;;;;-----------Aitken acceleration--------------------
(define (Aitken-acceleration strm)
\ \ (let* ([s0 (stream-ref strm 0)]
\ \ \ \ \ \ \ \ \ [s1 (stream-ref strm 1)]
\ \ \ \ \ \ \ \ \ [s2 (stream-ref strm 2)]
\ \ \ \ \ \ \ \ \ [news (vector-map (lambda (e0 e1 e2);;vector-map
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (if (zero? (+
e0 e2 (* -2 e1)));;denominate is 0
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ e2
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (- e2
(/ (expt (- e2 e1) 2)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (+
e0 e2 (* -2 e1))))))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ s0 s1 s2)])
\ \ \ \ (stream-cons news (Aitken-acceleration (stream-cdr strm)))))
;;;;;-------------super acceleration--------------------
;;Jacobi-stream-from / GaussSiedel-stream-from: the streams obtained by
;;iterating nextstate-Jacobi / nextstate-GaussSiedel from x0
(define (super-Jacobi-strm A b n x0)
\ \ (Super-accelerator-stream Aitken-acceleration
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (Jacobi-stream-from A b n x0)))
(define (super-GaussSiedel-strm A b n x0)
\ \ (Super-accelerator-stream Aitken-acceleration
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (GaussSiedel-stream-from A b n x0)))
\;
</scm-code>
<subsubsection|The convergence problem>
A problem has not yet been answered, i.e., when do the Jacobi and
Gauss-Seidel methods work, and when do they fail? The theorem given
before only concerns the iteration matrix <math|G>, not <math|A>.
Here is a theorem that tells when the Jacobi and Gauss-Seidel methods
work,
<\theorem>
If <math|A> is strictly diagonally dominant, i.e.,
<\equation*>
<big|sum><rsub|i<around*|(|\<neq\>j|)>><around*|\||<frac|a<rsub|i
j>|a<rsub|i i>>|\|>\<less\>1,\<forall\>j,
</equation*>
then both the Jacobi and Gauss-Seidel methods can give the unique
solution.
</theorem>
Here is another one which in some sense solves <strong|all problems>:
<\theorem>
If <math|A> is symmetric and positive definite, then the Gauss-Seidel
method can give the unique solution.
</theorem>
<\note>
Previously we said that to solve <math|A<wide|x|\<vect\>>=<wide|b|\<vect\>>>,
we can solve <math|A<rsup|T>A<wide|x|\<vect\>>=A<rsup|T><wide|b|\<vect\>>>
instead, and <math|A<rsup|T>A> is exactly symmetric and positive
definite!
</note>
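A sketch of this trick, assuming hypothetical helpers
<verbatim|matrix-transpose>, <verbatim|matrix-mul> and
<verbatim|matrix-vec> for the matrix products (they are not defined in
this document):
<\scm-code>
;;solve A x = b via Gauss-Seidel on the normal eqns (a sketch)
(define (solve-by-normal-eqns A b x0 n steps)
\ \ (let* ([At (matrix-transpose A)]
\ \ \ \ \ \ \ \ \ [C (matrix-mul At A)] ;;symmetric, positive definite
\ \ \ \ \ \ \ \ \ [c (matrix-vec At b)])
\ \ \ \ (do ((k 0 (+ k 1))
\ \ \ \ \ \ \ \ \ (x x0 (nextstate-GaussSiedel C c x n)))
\ \ \ \ \ \ \ \ ((= k steps) x))))
</scm-code>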
<subsection|Non-linear eqns>
<strong|Problem:> Solve <math|<wide|x|\<vect\>>\<in\>\<bbb-R\><rsup|n>> in
the non-linear eqns
<\equation*>
<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>>|)>=0.
</equation*>
<subsubsection|Newton method>
As in the one-variable case, suppose <math|<wide|x|\<vect\>><rsub|\<ast\>>> is
the (unique) solution, and <math|<wide|x|\<vect\>>> is a supposed
approximation of <math|<wide|x|\<vect\>><rsub|\<ast\>>>, by
<\equation*>
0=<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|\<ast\>>|)>\<simeq\><wide|f|\<vect\>><around*|(|<wide|x|\<vect\>>|)>+<wide|f|\<vect\>><rprime|'><around*|(|<wide|x|\<vect\>>|)><around*|(|<wide|x|\<vect\>><rsub|\<ast\>>-<wide|x|\<vect\>>|)>,
</equation*>
where <math|<around*|(|<wide|f|\<vect\>><rprime|'><around*|(|<wide|x|\<vect\>>|)>|)><rsub|i
j>\<assign\><around*|(|<frac|\<partial\>f<rsub|i>|\<partial\>x<rsub|j>>|)>>
is the <em|Jacobian matrix>, i.e., the <em|Frechet derivative>, of
<math|<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>>|)>>. Then one can solve
approximately for <math|<wide|x|\<vect\>><rsub|\<ast\>>> as
<math|<wide|x|\<vect\>><rsub|\<ast\>>\<simeq\><wide|x|\<vect\>>-<wide|f|\<vect\>><rprime|'><around*|(|<wide|x|\<vect\>>|)><rsup|-1><wide|f|\<vect\>><around*|(|<wide|x|\<vect\>>|)>>,
which gives an iterative method called the <strong|Newton method>:
<\equation>
<wide|x|\<vect\>><rsub|k+1>=<wide|x|\<vect\>><rsub|k>-<wide|f|\<vect\>><rprime|'><around*|(|<wide|x|\<vect\>><rsub|k>|)><rsup|-1><wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>|)><label|Newton-method-multivars>.
</equation>
To avoid computing the inverse <math|<wide|f|\<vect\>><rprime|'><rsup|-1>>,
one can instead take the following two steps:
<\eqnarray>
<tformat|<table|<row|<cell|<wide|f|\<vect\>><rprime|'><around*|(|<wide|x|\<vect\>><rsub|k>|)><around*|(|-\<Delta\><wide|x|\<vect\>><rsub|k>|)>>|<cell|=>|<cell|<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>|)>,>>|<row|<cell|<wide|x|\<vect\>><rsub|k+1>>|<cell|=>|<cell|<wide|x|\<vect\>><rsub|k>-<around*|(|-\<Delta\><wide|x|\<vect\>><rsub|k>|)>.>>>>
</eqnarray>
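To make the procedure concrete, here is a minimal sketch of the Newton
iteration for <math|n=2>, with the Jacobian supplied analytically and the
linear step solved by Cramer's rule (an illustration only; the helper names
<verbatim|solve-2x2> and <verbatim|newton-2d> are ours, and for general
<math|n> one would call the elimination routines of the linear-eqns section):
<\scm-code>
;; solve the 2x2 linear system J d = r by Cramer's rule
(define (solve-2x2 J r)
\ \ (let* ([a (vector-ref (vector-ref J 0) 0)] [b (vector-ref (vector-ref J 0) 1)]
\ \ \ \ \ \ \ \ \ [c (vector-ref (vector-ref J 1) 0)] [d (vector-ref (vector-ref J 1) 1)]
\ \ \ \ \ \ \ \ \ [det (- (* a d) (* b c))])
\ \ \ \ (vector (/ (- (* d (vector-ref r 0)) (* b (vector-ref r 1))) det)
\ \ \ \ \ \ \ \ \ \ \ \ (/ (- (* a (vector-ref r 1)) (* c (vector-ref r 0))) det))))
;; Newton iteration: x_{k+1} = x_k - dx, where f'(x_k) dx = f(x_k)
(define (newton-2d f jac x0 tol max-iter)
\ \ (let loop ([x x0] [k 0])
\ \ \ \ (let* ([dx (solve-2x2 (jac x) (f x))]
\ \ \ \ \ \ \ \ \ \ \ [x1 (vector (- (vector-ref x 0) (vector-ref dx 0))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (- (vector-ref x 1) (vector-ref dx 1)))])
\ \ \ \ \ \ (if (or (= k max-iter)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ (\<less\> (+ (abs (vector-ref dx 0)) (abs (vector-ref dx 1))) tol))
\ \ \ \ \ \ \ \ \ \ x1
\ \ \ \ \ \ \ \ \ \ (loop x1 (+ k 1))))))
;; example: x^2 + y^2 = 4 and x y = 1, started near (2, 0.5)
(newton-2d
\ (lambda (v) (let ([x (vector-ref v 0)] [y (vector-ref v 1)])
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (vector (+ (* x x) (* y y) -4) (- (* x y) 1))))
\ (lambda (v) (let ([x (vector-ref v 0)] [y (vector-ref v 1)])
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (vector (vector (* 2 x) (* 2 y)) (vector y x))))
\ (vector 2.0 0.5) 1e-12 50) ;; =\<gtr\> approximately #(1.9319 0.5176)
</scm-code>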
<\note>
The derivative <math|<wide|f|\<vect\>><rprime|'><rsub|i
j>=<frac|\<partial\>f<rsub|i>|\<partial\>x<rsub|j>><around*|\||<wide|x|\<vect\>><rsub|k>|\<nobracket\>>>
is still hard to compute, but one can use the secant method, similar to the
one-variable case, as shown in the next subsubsection.
</note>
<subsubsection|Newton-Secant method>
\;
We need <math|<around*|(|n+1|)>> points
<math|<wide|\<xi\>|\<vect\>><rsub|k,0><around*|(|\<equiv\><wide|x|\<vect\>><rsub|k>|)>,<wide|\<xi\>|\<vect\>><rsub|k,1>,\<cdots\>,<wide|\<xi\>|\<vect\>><rsub|k,n>>
and the values of <math|<wide|f|\<vect\>>> at these points, such that
<\equation*>
<around*|(|<wide|\<xi\>|\<vect\>><rsub|k,0>,f<rsub|i><around*|(|<wide|\<xi\>|\<vect\>><rsub|k,0>|)>|)>,\<cdots\>,<around*|(|<wide|\<xi\>|\<vect\>><rsub|k,n>,f<rsub|i><around*|(|<wide|\<xi\>|\<vect\>><rsub|k,n>|)>|)>
</equation*>
span a hyperplane in <math|<around*|(|<wide|x|\<vect\>>,z<rsub|i>|)>>-space,
with expression <math|z<rsub|i>=<wide|a|\<vect\>><rsup|i,T><wide|x|\<vect\>>+b<rsup|i>>,
which approximates <math|z<rsub|i>=f<rsub|i><around*|(|<wide|x|\<vect\>>|)>>
near <math|<around*|(|<wide|x|\<vect\>><rsub|k>,f<rsub|i><around*|(|<wide|x|\<vect\>><rsub|k>|)>|)>>,
for all <math|i=1,\<cdots\>,n>. Thus we have the approximation
<math|<frac|\<partial\>f<rsub|i>|\<partial\>x<rsub|j>><around*|\||<wide|x|\<vect\>><rsub|k>|\<nobracket\>>\<simeq\><wide|a|\<vect\>><rsup|i><around*|[|j|]>>.
Those <math|<wide|a|\<vect\>><rsup|i>> and <math|b<rsup|i>> satisfy
<\equation*>
f<rsub|i><around*|(|<wide|\<xi\>|\<vect\>><rsub|k,j>|)>=<wide|a|\<vect\>><rsup|i,T><wide|\<xi\>|\<vect\>><rsub|k,j>+b<rsup|i>,\<forall\>i=1,\<cdots\>,n;j=0,1,\<cdots\>,n.
</equation*>
Subtracting the 0th eqn from the <math|j>th, one gets
<\equation*>
f<rsub|i><around*|(|<wide|\<xi\>|\<vect\>><rsub|k,j>|)>-f<rsub|i><around*|(|<wide|x|\<vect\>><rsub|k>|)>=<wide|a|\<vect\>><rsup|i,T><around*|(|<wide|\<xi\>|\<vect\>><rsub|k,j>-<wide|x|\<vect\>><rsub|k>|)>,i,j=1,\<cdots\>,n,
</equation*>
which is the <math|<around*|(|i,j|)>>-element of
<\equation*>
\<Gamma\><rsub|k>=A<rsub|k>H<rsub|k>,
</equation*>
where\
<\eqnarray>
<tformat|<table|<row|<cell|\<Gamma\><rsub|k>>|<cell|\<assign\>>|<cell|<around*|[|<wide|f|\<vect\>><around*|(|<wide|\<xi\>|\<vect\>><rsub|k,1>|)>-<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>|)>,\<ldots\>,<wide|f|\<vect\>><around*|(|<wide|\<xi\>|\<vect\>><rsub|k,n>|)>-<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>|)>|]>,>>|<row|<cell|A<rsub|k>>|<cell|\<assign\>>|<cell|<around*|[|<tabular*|<tformat|<table|<row|<cell|<wide|a|\<vect\>><rsup|1,T>>>|<row|<cell|\<vdots\>>>|<row|<cell|<wide|a|\<vect\>><rsup|n,T>>>>>>|]>,>>|<row|<cell|H<rsub|k>>|<cell|\<assign\>>|<cell|<around*|[|<wide|\<xi\>|\<vect\>><rsub|k,1>-<wide|x|\<vect\>><rsub|k>,\<cdots\>,<wide|\<xi\>|\<vect\>><rsub|k,n>-<wide|x|\<vect\>><rsub|k>|]>.>>>>
</eqnarray>
Note that <math|<around*|[|<wide|f|\<vect\>><rprime|'><around*|(|<wide|x|\<vect\>><rsub|k>|)>|]><rsub|i
j>\<simeq\><wide|a|\<vect\>><rsup|i><around*|[|j|]>=<around*|[|A<rsub|k>|]><rsub|i
j>>, and we get the <strong|Newton-Secant method>:
<\equation*>
<wide|x|\<vect\>><rsub|k+1>=<wide|x|\<vect\>><rsub|k>-A<rsub|k><rsup|-1><wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>|)>=<wide|x|\<vect\>><rsub|k>-H<rsub|k>\<Gamma\><rsub|k><rsup|-1><wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>|)>,
</equation*>
whose computation procedure is
<\equation>
<choice|<tformat|<table|<row|<cell|<wide|x|\<vect\>><rsub|k+1>>|<cell|=>|<cell|<wide|x|\<vect\>><rsub|k>-H<rsub|k><wide|z|\<vect\>><rsub|k>>>|<row|<cell|\<Gamma\><rsub|k><wide|z|\<vect\>><rsub|k>>|<cell|=>|<cell|<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>|)>>>>>><label|Newton-Sceant-multivars>.
</equation>
Now we need to choose the <math|n> points
<math|<around*|(|<wide|\<xi\>|\<vect\>><rsub|k,1>,\<cdots\>,<wide|\<xi\>|\<vect\>><rsub|k,n>|)>>.
These points generally depend on <math|<wide|x|\<vect\>><rsub|k>,<wide|x|\<vect\>><rsub|k-1>,\<cdots\>,<wide|x|\<vect\>><rsub|0>>;
if <math|p> points are used, we call it the \P<em|<math|p>-point secant
method>\Q, while if <math|<around*|(|<wide|x|\<vect\>><rsub|k>,\<cdots\>,<wide|x|\<vect\>><rsub|k-p+1>|)>>
are used, we call it the \P<em|<math|p>-point sequential secant method>\Q.
Here we give the \P<strong|2-point sequential secant method>\Q: we use only
the two points <math|<around*|(|<wide|x|\<vect\>><rsub|k>,<wide|x|\<vect\>><rsub|k-1>|)>>
and let
<\equation*>
<choice|<tformat|<table|<row|<cell|h<rsub|k,j>\<assign\><wide|x|\<vect\>><rsub|k-1><around*|[|j|]>-<wide|x|\<vect\>><rsub|k><around*|[|j|]>>>|<row|<cell|<wide|\<xi\>|\<vect\>><rsub|k,j>\<assign\><wide|x|\<vect\>><rsub|k>+h<rsub|k,j><wide|e|\<vect\>><rsub|j>>>>>>,
</equation*>
where <math| j=1,\<cdots\>,n> and <math|<wide|e|\<vect\>><rsub|j>=<around*|[|0,\<ldots\>,1<rsub|j>,\<ldots\>0|]><rsup|T>>.
Thus,
<\eqnarray>
<tformat|<table|<row|<cell|H<rsub|k>>|<cell|=>|<cell|diag<around*|(|h<rsub|k,1>,\<cdots\>,h<rsub|k,n>|)>,>>|<row|<cell|\<Gamma\><rsub|k>>|<cell|=>|<cell|<around*|[|<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>+h<rsub|k,1><wide|e|\<vect\>><rsub|1>|)>-<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>|)>,\<cdots\>,<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>+h<rsub|k,n><wide|e|\<vect\>><rsub|n>|)>-<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>|)>|]>.>>>>
</eqnarray>
[See p. 197 of Ref-1, where an alternative method that saves some
calculations is given.]
Note that in each step of the Newton-Secant method one has to solve a
linear system, which is a large amount of work.
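As an illustration, here is one step of the 2-point sequential secant method
for <math|n=2>, reusing <verbatim|solve-2x2> from the Newton sketch above (a
sketch only; <verbatim|secant2-step> and <verbatim|vec2-> are our names, not
from Ref-1):
<\scm-code>
;; (x_k, x_{k-1}) -\<gtr\> x_{k+1}: solve Gamma z = f(x_k), then x_{k+1} = x_k - H z
(define (vec2- u v)
\ \ (vector (- (vector-ref u 0) (vector-ref v 0))
\ \ \ \ \ \ \ \ \ \ (- (vector-ref u 1) (vector-ref v 1))))
(define (secant2-step f xk xk-1)
\ \ (let* ([h1 (- (vector-ref xk-1 0) (vector-ref xk 0))]
\ \ \ \ \ \ \ \ \ [h2 (- (vector-ref xk-1 1) (vector-ref xk 1))]
\ \ \ \ \ \ \ \ \ [fx (f xk)]
\ \ \ \ \ \ \ \ \ ;; columns of Gamma: f(xi_j) - f(x_k), with xi_j = x_k + h_j e_j
\ \ \ \ \ \ \ \ \ [g1 (vec2- (f (vector (+ (vector-ref xk 0) h1) (vector-ref xk 1))) fx)]
\ \ \ \ \ \ \ \ \ [g2 (vec2- (f (vector (vector-ref xk 0) (+ (vector-ref xk 1) h2))) fx)]
\ \ \ \ \ \ \ \ \ [Gamma (vector (vector (vector-ref g1 0) (vector-ref g2 0))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (vector (vector-ref g1 1) (vector-ref g2 1)))]
\ \ \ \ \ \ \ \ \ [z (solve-2x2 Gamma fx)])
\ \ \ \ ;; H = diag(h1, h2), so H z acts componentwise
\ \ \ \ (vector (- (vector-ref xk 0) (* h1 (vector-ref z 0)))
\ \ \ \ \ \ \ \ \ \ \ \ (- (vector-ref xk 1) (* h2 (vector-ref z 1))))))
</scm-code>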
<\note>
Although the Newton-Secant method gives a way to approximate the Jacobian
in the Newton method, it still needs to solve a linear system in each step,
which is a large amount of work.
Quasi-Newton methods avoid the linear-system-solving step and give the
iteration directly; see the next subsubsection.
</note>
<subsubsection|Quasi-Newton method>
In both the Newton and the Newton-Secant methods, solving the eqns amounts
to an iterative relation of the form
<\equation*>
<wide|x|\<vect\>><rsub|k+1>=<wide|x|\<vect\>><rsub|k>-A<rsub|k+1><rsup|-1><wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>|)>\<equiv\><wide|x|\<vect\>><rsub|k>-H<rsub|k><wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>|)>.
</equation*>
In the Newton method, <math|A<rsub|k+1>=<wide|f|\<vect\>><rprime|'><around*|(|<wide|x|\<vect\>><rsub|k>|)>>,
while in the Newton-Secant method <math|A<rsub|k+1>> is the secant
approximation of <math|<wide|f|\<vect\>><rprime|'><around*|(|<wide|x|\<vect\>><rsub|k>|)>>.
Let
<\eqnarray>
<tformat|<table|<row|<cell|\<Delta\><wide|x|\<vect\>><rsub|k>>|<cell|\<assign\>>|<cell|<wide|x|\<vect\>><rsub|k+1>-<wide|x|\<vect\>><rsub|k>,>>|<row|<cell|\<Delta\><wide|y|\<vect\>><rsub|k>>|<cell|\<assign\>>|<cell|<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k+1>|)>-<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>|)>,>>>>
</eqnarray>
and since
<\equation>
<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k+1>|)>-<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>|)>\<simeq\><wide|f|\<vect\>><rprime|'><around*|(|<wide|x|\<vect\>><rsub|k>|)>\<Delta\><wide|x|\<vect\>><rsub|k><label|linear-approx>,
</equation>
one can get (let <math|A<rsub|k>=<wide|f|\<vect\>><rprime|'><around*|(|<wide|x|\<vect\>><rsub|k>|)>>
or <math|H<rsub|k>=<wide|f|\<vect\>><rprime|'><around*|(|<wide|x|\<vect\>><rsub|k>|)><rsup|-1>>)
<\equation*>
\<Delta\><wide|y|\<vect\>><rsub|k>=A<rsub|k>\<Delta\><wide|x|\<vect\>><rsub|k>,
</equation*>
or
<\equation>
\<Delta\><wide|x|\<vect\>><rsub|k>=H<rsub|k>\<Delta\><wide|y|\<vect\>><rsub|k><label|iterative-eqns>.
</equation>
Note that all we need is to make eqns (<reference|iterative-eqns>) hold;
<math|H<rsub|k>> need not approximate <math|<wide|f|\<vect\>><rprime|'><around*|(|<wide|x|\<vect\>><rsub|k>|)><rsup|-1>>,
which opens the road to other methods.
A <em|quasi-Newton method> gives an iterative relation for
<math|<around*|{|A<rsub|k>|}>> or <math|<around*|{|H<rsub|k>|}>> of the form
<\equation>
H<rsub|k+1>=H<rsub|k>+E<rsub|k><label|quasi-newton>,
</equation>
making the iterative eqns (<reference|iterative-eqns>) satisfied, where
<math|E<rsub|k>> is called the <strong|correcting matrix>. Different
choices of <math|E<rsub|k>> define different quasi-Newton methods.
Here we give the best-known framework, the <strong|BFS method> (from
Broyden, Fletcher and Shanno; essentially the update elsewhere called BFGS,
with Goldfarb), where <math|E<rsub|k>> is a rank-2 matrix.
Let <math|E<rsub|k>> be a rank-2 matrix,
<\equation*>
E<rsub|k>=U<rsub|k>V<rsub|k><rsup|T>,
</equation*>
where <math|U<rsub|k>,V<rsub|k>> are <math|n\<times\>2> matrices, and
<math|U<rsub|k>=<around*|[|<wide|u|\<vect\>><rsub|k><rsup|<around*|(|1|)>>,<wide|u|\<vect\>><rsub|k><rsup|<around*|(|2|)>>|]>,V<rsub|k>=<around*|[|<wide|v|\<vect\>><rsub|k><rsup|<around*|(|1|)>>,<wide|v|\<vect\>><rsub|k><rsup|<around*|(|2|)>>|]>>,
then
<\equation*>
E<rsub|k>=<wide|u|\<vect\>><rsub|k><rsup|<around*|(|1|)>><wide|v|\<vect\>><rsub|k><rsup|<around*|(|1|)>T>+<wide|u|\<vect\>><rsub|k><rsup|<around*|(|2|)>><wide|v|\<vect\>><rsub|k><rsup|<around*|(|2|)>T>.
</equation*>
Putting the above expression into the iterative
eqns (<reference|iterative-eqns>), one gets
<\equation*>
<around*|(|H<rsub|k>+<wide|u|\<vect\>><rsub|k><rsup|<around*|(|1|)>><wide|v|\<vect\>><rsub|k><rsup|<around*|(|1|)>T>+<wide|u|\<vect\>><rsub|k><rsup|<around*|(|2|)>><wide|v|\<vect\>><rsub|k><rsup|<around*|(|2|)>T>|)>\<Delta\><wide|y|\<vect\>><rsub|k>=\<Delta\><wide|x|\<vect\>><rsub|k>,
</equation*>
or
<\equation*>
<around*|(|<wide|u|\<vect\>><rsub|k><rsup|<around*|(|1|)>><wide|v|\<vect\>><rsub|k><rsup|<around*|(|1|)>T>+<wide|u|\<vect\>><rsub|k><rsup|<around*|(|2|)>><wide|v|\<vect\>><rsub|k><rsup|<around*|(|2|)>T>|)>\<Delta\><wide|y|\<vect\>><rsub|k>=\<Delta\><wide|x|\<vect\>><rsub|k>-H<rsub|k>\<Delta\><wide|y|\<vect\>><rsub|k>.
</equation*>
First, let <math|<wide|u|\<vect\>><rsub|k><rsup|<around*|(|1|)>>=\<Delta\><wide|x|\<vect\>><rsub|k>>
and <math|<wide|u|\<vect\>><rsub|k><rsup|<around*|(|2|)>>=-H<rsub|k>\<Delta\><wide|y|\<vect\>><rsub|k>>,
then
<\equation*>
\<Delta\><wide|x|\<vect\>><rsub|k><wide|v|\<vect\>><rsub|k><rsup|<around*|(|1|)>T>\<Delta\><wide|y|\<vect\>><rsub|k>-H<rsub|k>\<Delta\><wide|y|\<vect\>><rsub|k><wide|v|\<vect\>><rsub|k><rsup|<around*|(|2|)>T>\<Delta\><wide|y|\<vect\>><rsub|k>=\<Delta\><wide|x|\<vect\>><rsub|k>-H<rsub|k>\<Delta\><wide|y|\<vect\>><rsub|k>.
</equation*>
Second, if we find <math|<wide|v|\<vect\>><rsub|k><rsup|<around*|(|1|)>>,<wide|v|\<vect\>><rsub|k><rsup|<around*|(|2|)>>>
such that <math|<wide|v|\<vect\>><rsub|k><rsup|<around*|(|1|)>T>\<Delta\><wide|y|\<vect\>><rsub|k>=1=<wide|v|\<vect\>><rsub|k><rsup|<around*|(|2|)>T>\<Delta\><wide|y|\<vect\>><rsub|k>>,
then the iterative eqns (<reference|iterative-eqns>) are satisfied.
The following choice satisfies (<reference|iterative-eqns>):
<\eqnarray>
<tformat|<table|<row|<cell|<wide|v|\<vect\>><rsup|<around*|(|1|)>T>>|<cell|\<assign\>>|<cell|<around*|(|1+\<beta\>\<Delta\><wide|y|\<vect\>><rsub|k><rsup|T>H<rsub|k>\<Delta\><wide|y|\<vect\>><rsub|k>|)><frac|\<Delta\><wide|x|\<vect\>><rsub|k><rsup|T>|\<Delta\><wide|x|\<vect\>><rsub|k><rsup|T>\<Delta\><wide|y|\<vect\>><rsub|k>>-\<beta\>\<Delta\><wide|y|\<vect\>><rsub|k><rsup|T>H<rsub|k>,>>|<row|<cell|<wide|v|\<vect\>><rsup|<around*|(|2|)>T>>|<cell|\<assign\>>|<cell|<around*|(|1-\<beta\>\<Delta\><wide|x|\<vect\>><rsub|k><rsup|T>\<Delta\><wide|y|\<vect\>><rsub|k>|)><frac|\<Delta\><wide|y|\<vect\>><rsub|k><rsup|T>H<rsub|k>|\<Delta\><wide|y|\<vect\>><rsub|k><rsup|T>H<rsub|k>\<Delta\><wide|y|\<vect\>><rsub|k>>+\<beta\>\<Delta\><wide|x|\<vect\>><rsub|k><rsup|T>,>>>>
</eqnarray>
where <math|\<beta\>\<in\>\<bbb-R\>> is to be chosen; each choice defines a
rank-2 method.
<strong|BFS method> chooses
<\equation*>
\<beta\>=<frac|1|\<Delta\><wide|x|\<vect\>><rsub|k><rsup|T>\<Delta\><wide|y|\<vect\>><rsub|k>>,
</equation*>
and eqn (<reference|quasi-newton>) becomes
<\eqnarray>
<tformat|<table|<row|<cell|H<rsub|k+1>>|<cell|=>|<cell|H<rsub|k>+<frac|1|\<Delta\><wide|x|\<vect\>><rsub|k><rsup|T>\<Delta\><wide|y|\<vect\>><rsub|k>><around*|(|\<mu\><rsub|k>\<Delta\><wide|x|\<vect\>><rsub|k>\<Delta\><wide|x|\<vect\>><rsub|k><rsup|T>-H<rsub|k>\<Delta\><wide|y|\<vect\>><rsub|k>\<Delta\><wide|x|\<vect\>><rsub|k><rsup|T>-\<Delta\><wide|x|\<vect\>><rsub|k>\<Delta\><wide|y|\<vect\>><rsub|k><rsup|T>H<rsub|k>|)>,>>|<row|<cell|\<mu\><rsub|k>>|<cell|\<assign\>>|<cell|1+<frac|1|\<Delta\><wide|x|\<vect\>><rsub|k><rsup|T>\<Delta\><wide|y|\<vect\>><rsub|k>>\<Delta\><wide|y|\<vect\>><rsub|k><rsup|T>H<rsub|k>\<Delta\><wide|y|\<vect\>><rsub|k>.>>>>
</eqnarray>
Combining these eqns and (<reference|quasi-newton>) with the step
<\equation*>
<wide|x|\<vect\>><rsub|k+1>=<wide|x|\<vect\>><rsub|k>-H<rsub|k><wide|f|\<vect\>><around*|(|<wide|x|\<vect\>><rsub|k>|)>,
</equation*>
one gets a quasi-Newton method: the BFS method.
Note that one needs to give initial values <math|<wide|x|\<vect\>><rsub|0>>
and <math|H<rsub|0>> to start the iterations:
<math|<around*|(|<wide|x|\<vect\>><rsub|1>,\<Delta\><wide|x|\<vect\>><rsub|0>,\<Delta\><wide|y|\<vect\>><rsub|0>,H<rsub|1>|)>\<Rightarrow\><around*|(|<wide|x|\<vect\>><rsub|2>,\<Delta\><wide|x|\<vect\>><rsub|1>,\<Delta\><wide|y|\<vect\>><rsub|1>,H<rsub|2>|)>\<Rightarrow\>\<cdots\>>.
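Here is a sketch of the <math|H>-update alone, with vectors as Scheme
vectors and matrices as vectors of row-vectors; it assumes <math|H<rsub|k>>
is kept symmetric (which holds if <math|H<rsub|0>> is chosen symmetric, so
that <math|\<Delta\><wide|x|\<vect\>><rsub|k>\<Delta\><wide|y|\<vect\>><rsub|k><rsup|T>H<rsub|k>>
is the outer product of <math|\<Delta\><wide|x|\<vect\>><rsub|k>> with
<math|H<rsub|k>\<Delta\><wide|y|\<vect\>><rsub|k>>); the helper names are
ours:
<\scm-code>
;; helpers: dot product, matrix * vector, outer product u v^T
(define (dot u v)
\ \ (let loop ([i 0] [s 0])
\ \ \ \ (if (= i (vector-length u)) s
\ \ \ \ \ \ \ \ (loop (+ i 1) (+ s (* (vector-ref u i) (vector-ref v i)))))))
(define (mat-vec M v) (vector-map (lambda (row) (dot row v)) M))
(define (outer u v)
\ \ (vector-map (lambda (ui) (vector-map (lambda (vj) (* ui vj)) v)) u))
;; BFS update: H' = H + (mu dx dx^T - (H dy) dx^T - dx (H dy)^T) / (dx . dy)
(define (bfs-update H dx dy)
\ \ (let* ([s (dot dx dy)]
\ \ \ \ \ \ \ \ \ [Hy (mat-vec H dy)]
\ \ \ \ \ \ \ \ \ [mu (+ 1 (/ (dot dy Hy) s))])
\ \ \ \ (vector-map (lambda (Hrow Arow Brow Crow)
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (vector-map (lambda (h a b c) (+ h (/ (- (* mu a) b c) s)))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ Hrow Arow Brow Crow))
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ H (outer dx dx) (outer Hy dx) (outer dx Hy))))
;; H0 may be taken as the identity, e.g. (vector (vector 1 0) (vector 0 1))
</scm-code>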
<\note>
(i) Quasi-Newton methods do not solve a linear system in each step, so they
can be much faster than the Newton-Secant method.
(ii) <math|\<beta\>=0> defines the DFP (Davidon, Fletcher and Powell)
method, which is less stable than the BFS method.
(iii) Quasi-Newton methods and the Newton(-Secant) method share the
disadvantage that they converge only locally, which means one needs good
initial points to start the iteration. The Levenberg-Marquardt method is an
improvement of the Newton(-Secant) method with good convergence properties;
see the next subsubsection.
</note>
<subsubsection|Levenberg-Marquardt method>
Both the Newton(-Secant) method and quasi-Newton methods face the
convergence problem, especially when the Jacobian (or quasi-Jacobian)
matrix is singular.
To avoid this problem, one can transform the eqn-solving problem into the
minimization problem
<\equation*>
<below|min|d\<in\>\<bbb-R\><rsup|n>><frac|1|2><around*|\<\|\|\>|<wide|f|\<vect\>><rsub|k>+J<rsub|k><wide|d|\<vect\>>|\<\|\|\>><rsup|2>,
</equation*>
where <math|J<rsub|k>> is <math|<wide|f|\<vect\>><rprime|'><around*|(|<wide|x|\<vect\>><rsub|k>|)>>
in the Newton method or <math|A<rsub|k>> in the Newton-Secant method. After
finding the minimum point <math|<wide|d|\<vect\>>>, one gets the next point
<math|<wide|x|\<vect\>><rsub|k+1>=<wide|x|\<vect\>><rsub|k>+<wide|d|\<vect\>>>.
If <math|J<rsub|k><rsup|T>J<rsub|k>> is non-degenerate, the solution is
<math|<wide|d|\<vect\>>=-<around*|(|J<rsub|k><rsup|T>J<rsub|k>|)><rsup|-1>J<rsub|k><rsup|T><wide|f|\<vect\>><rsub|k><around*|(|=-J<rsub|k><rsup|-1><wide|f|\<vect\>><rsub|k>|)>>,
and the next point is
<\equation*>
<wide|x|\<vect\>><rsub|k+1>=<wide|x|\<vect\>><rsub|k>-<around*|(|J<rsub|k><rsup|T>J<rsub|k>|)><rsup|-1>J<rsub|k><rsup|T><wide|f|\<vect\>><rsub|k>.
</equation*>
But if <math|J<rsub|k><rsup|T>J<rsub|k>> is degenerate, one can introduce a
parameter <math|\<lambda\><rsub|k>\<gtr\>0> making
<math|J<rsub|k><rsup|T>J<rsub|k>+\<lambda\><rsub|k>E> non-degenerate, and
consider the modified version, called the <strong|Levenberg-Marquardt
method>:
<\equation*>
<wide|x|\<vect\>><rsub|k+1>=<wide|x|\<vect\>><rsub|k>-<around*|(|J<rsub|k><rsup|T>J<rsub|k>+\<lambda\><rsub|k>E|)><rsup|-1>J<rsub|k><rsup|T><wide|f|\<vect\>><rsub|k>.
</equation*>
It has been proved (Tseng et al.) that if
<math|<wide|f|\<vect\>><around*|(|<wide|x|\<vect\>>|)>> satisfies a local
error bound, then the above method converges to the solution with
<math|\<lambda\><rsub|k>=<around*|\<\|\|\>|<wide|f|\<vect\>><rsub|k>|\<\|\|\>><rsup|2>>
(but <math|\<lambda\><rsub|k>=<around*|\<\|\|\>|<wide|f|\<vect\>><rsub|k>|\<\|\|\>>>
seems better; see Ref-3).
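The structure of the step is easiest to see in the scalar case, where
<math|J<rsup|T>J+\<lambda\>E> degenerates to <math|J<rsup|2>+\<lambda\>>;
here is a minimal sketch with <math|\<lambda\><rsub|k>=<around*|\<\|\|\>|<wide|f|\<vect\>><rsub|k>|\<\|\|\>><rsup|2>>:
<\scm-code>
;; scalar LM step: x - (J^2 + lambda)^{-1} J f(x), with lambda = f(x)^2
(define (lm-step-1d f df x)
\ \ (let* ([fx (f x)] [j (df x)] [lam (* fx fx)])
\ \ \ \ (- x (/ (* j fx) (+ (* j j) lam)))))
;; f(x) = x^2 has Jacobian J = 2x, singular at the root x = 0, yet the
;; iterates from x = 1 (0.6, 0.325, ...) still approach 0
(lm-step-1d (lambda (x) (* x x)) (lambda (x) (* 2 x)) 1.0) ;; =\<gtr\> 0.6
</scm-code>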
First we give the Levenberg-Marquardt method with <math|J<rsub|k>> defined
by the secant method (see the Newton-Secant method); then we give the
improved version, the <em|trust-region version of the Levenberg-Marquardt
method>.
<strong|Levenberg-Marquardt-Secant method:>
As in the 2-point sequential secant method (see the Newton-Secant part),
suppose we have two successive approximations
<math|<wide|x|\<vect\>>,<wide|y|\<vect\>>>; the Jacobian <math|J> can then
be approximated by
<\equation*>
J\<simeq\>\<Gamma\>H<rsup|-1>,
</equation*>
where <math|\<Gamma\>> and <math|H> are defined in the following:
<\eqnarray>
<tformat|<table|<row|<cell|h<rsub|j>>|<cell|=>|<cell|<wide|x|\<vect\>><around*|[|j|]>-<wide|y|\<vect\>><around*|[|j|]>,j=1,\<cdots\>,n;>>|<row|<cell|<wide|\<xi\>|\<vect\>><rsub|j>>|<cell|=>|<cell|<wide|y|\<vect\>>+h<rsub|j><wide|e|\<vect\>><rsub|j>;>>|<row|<cell|H>|<cell|=>|<cell|<around*|[|<wide|\<xi\>|\<vect\>><rsub|1>-<wide|y|\<vect\>>,\<cdots\>,<wide|\<xi\>|\<vect\>><rsub|n>-<wide|y|\<vect\>>|]>=diag<around*|(|h<rsub|1>,\<cdots\>,h<rsub|n>|)>;>>|<row|<cell|\<Gamma\>>|<cell|=>|<cell|<around*|[|<wide|f|\<vect\>><around*|(|<wide|\<xi\>|\<vect\>><rsub|1>|)>-<wide|f|\<vect\>><around*|(|<wide|y|\<vect\>>|)>,\<cdots\>,<wide|f|\<vect\>><around*|(|<wide|\<xi\>|\<vect\>><rsub|n>|)>-<wide|f|\<vect\>><around*|(|<wide|y|\<vect\>>|)>|]>.>>>>
</eqnarray>
So the Levenberg-Marquardt step
(<math|<wide|x|\<vect\>>,<wide|y|\<vect\>>\<Rightarrow\><wide|z|\<vect\>>>)
becomes
<\eqnarray>
<tformat|<table|<row|<cell|<wide|z|\<vect\>>>|<cell|=>|<cell|<wide|y|\<vect\>>-H<wide|\<delta\>|\<vect\>>,>>|<row|<cell|<around*|(|\<Gamma\><rsup|T>\<Gamma\>+<around*|\<\|\|\>|<wide|f|\<vect\>><around*|(|<wide|y|\<vect\>>|)>|\<\|\|\>>H<rsup|2>|)><wide|\<delta\>|\<vect\>>>|<cell|=>|<cell|\<Gamma\><rsup|T><wide|f|\<vect\>><around*|(|<wide|y|\<vect\>>|)>.>>>>
</eqnarray>
Note that <math|<around*|(|\<Gamma\><rsup|T>\<Gamma\>+<around*|\<\|\|\>|<wide|f|\<vect\>><around*|(|<wide|y|\<vect\>>|)>|\<\|\|\>>H<rsup|2>|)>>
is a symmetric positive matrix, which means the Gauss-Seidel method (or the
modified least-squares method) can be applied directly.
\;
\;
<section|Ordinary differential eqns: initial-value problems>
<strong|Problem:> Solve <math|<wide|x|\<vect\>><around*|(|t|)>> of odes
with initial values
<\eqnarray>
<tformat|<table|<row|<cell|<frac|d<wide|x|\<vect\>>|d
t>>|<cell|=>|<cell|<wide|f|\<vect\>><around*|(|t,<wide|x|\<vect\>>|)>,>>|<row|<cell|<wide|x|\<vect\>><around*|(|t<rsub|0>|)>>|<cell|=>|<cell|<wide|\<eta\>|\<vect\>>.>>>>
</eqnarray>
In a numerical method, one cannot get the solution as a function, but only a
sequence of <math|<wide|x|\<vect\>>> values at some discrete times:
<\equation*>
<around*|(|t<rsub|0>,<wide|\<eta\>|\<vect\>>|)>,<around*|(|t<rsub|1>,<wide|x|\<vect\>><rsub|1>|)>,\<cdots\>,<around*|(|t<rsub|n>,<wide|x|\<vect\>><rsub|n>|)>,\<cdots\>.
</equation*>
To get this seq, one typically finds an iterative relation between
<math|<wide|x|\<vect\>><rsub|n>> and <math|<wide|x|\<vect\>><rsub|n-1>,\<cdots\>,<wide|x|\<vect\>><rsub|n-p>>;
the case <math|p=1> is called a <em|one-step method>, while the others are
called <em|multi-step methods>.
The most famous one-step method is the <strong|Runge-Kutta method>, while
the most famous multi-step method is the <strong|Adams method>.
On the other hand, the iterative relation may be explicit or implicit, so
these methods can also be classified into <em|explicit-form methods> and
<em|implicit-form methods>.
The advantages and disadvantages of each method will be discussed in the
following parts.
<subsection|One-step methods: Runge-Kutta method>
A one-step method seeks an iterative relation of the form (let
<math|h<rsub|n>=t<rsub|n+1>-t<rsub|n>>),
(explicit-form method)
<\equation*>
<wide|x|\<vect\>><rsub|n+1>=<wide|x|\<vect\>><rsub|n>+h<rsub|n><wide|\<Phi\>|\<vect\>><around*|(|t<rsub|n>,<wide|x|\<vect\>><rsub|n>,h<rsub|n>|)>,
</equation*>
or (implicit-form method)
<\equation*>
<wide|x|\<vect\>><rsub|n+1>=<wide|x|\<vect\>><rsub|n>+h<rsub|n><wide|\<Phi\>|\<vect\>><around*|(|t<rsub|n>,<wide|x|\<vect\>><rsub|n>,<wide|x|\<vect\>><rsub|n+1>,h<rsub|n>|)>.
</equation*>
The Runge-Kutta method is a one-step method that can be derived by Taylor
expansion.
We take the 2nd-order RK method as an example; higher-order RK methods can
be derived in the same way but are much more complicated.
Consider
<\eqnarray>
<tformat|<table|<row|<cell|<wide|x|\<vect\>><around*|(|t+h|)>>|<cell|=>|<cell|<wide|x|\<vect\>><around*|(|t|)>+h<wide|<wide|x|\<vect\>>|\<dot\>><around*|(|t|)>+<frac|h<rsup|2>|2!><wide|<wide|x|\<vect\>>|\<ddot\>><around*|(|t|)>+O<around*|(|h<rsup|3>|)>>>|<row|<cell|>|<cell|=>|<cell|<wide|x|\<vect\>><around*|(|t|)>+h<wide|f|\<vect\>><around*|(|t,<wide|x|\<vect\>><around*|(|t|)>|)>+<frac|h<rsup|2>|2!><wide|<wide|f|\<vect\>>|\<dot\>><around*|(|t,<wide|x|\<vect\>><around*|(|t|)>|)>+O<around*|(|h<rsup|3>|)>>>|<row|<cell|>|<cell|=>|<cell|<wide|x|\<vect\>><around*|(|t|)>+h<wide|f|\<vect\>><around*|(|t,<wide|x|\<vect\>><around*|(|t|)>|)>+<frac|h<rsup|2>|2!><around*|(|\<partial\><rsub|t><wide|f|\<vect\>>+<wide|<wide|x|\<vect\>>|\<dot\>>\<cdot\>\<partial\><rsub|<wide|x|\<vect\>>><wide|f|\<vect\>>|)><around*|\|||\<nobracket\>><rsub|<around*|(|t,<wide|x|\<vect\>>|)>>+O<around*|(|h<rsup|3>|)>>>|<row|<cell|>|<cell|=>|<cell|<wide|x|\<vect\>><around*|(|t|)>+h<wide|f|\<vect\>><around*|(|t,<wide|x|\<vect\>><around*|(|t|)>|)>+<frac|h<rsup|2>|2!><around*|(|\<partial\><rsub|t><wide|f|\<vect\>>+<wide|f|\<vect\>>\<cdot\>\<partial\><rsub|<wide|x|\<vect\>>><wide|f|\<vect\>>|)><around*|\|||\<nobracket\>><rsub|<around*|(|t,<wide|x|\<vect\>>|)>>+O<around*|(|h<rsup|3>|)>>>>>
</eqnarray>
This gives the iteration function to 2nd order:
<\equation*>
<wide|\<Phi\>|\<vect\>><around*|(|t,<wide|x|\<vect\>>,h|)>=<wide|f|\<vect\>>+<frac|h|2><around*|(|\<partial\><rsub|t><wide|f|\<vect\>>+<wide|f|\<vect\>>\<cdot\>\<partial\><rsub|<wide|x|\<vect\>>><wide|f|\<vect\>>|)>.
</equation*>
The key point is that evaluations of <math|<wide|f|\<vect\>>> at a few
points can replace the derivatives, i.e., suppose that
<\equation*>
<wide|\<Phi\>|\<vect\>><around*|(|t,<wide|x|\<vect\>>,h|)>=c<rsub|1><wide|K|\<vect\>><rsub|1>+c<rsub|2><wide|K|\<vect\>><rsub|2>,
</equation*>
where
<\eqnarray>
<tformat|<table|<row|<cell|<wide|K|\<vect\>><rsub|1>>|<cell|=>|<cell|<wide|f|\<vect\>><around*|(|t,<wide|x|\<vect\>>|)>,>>|<row|<cell|<wide|K|\<vect\>><rsub|2>>|<cell|=>|<cell|<wide|f|\<vect\>><around*|(|t+a<rsub|2>h,<wide|x|\<vect\>>+b<rsub|21>h
<wide|K|\<vect\>><rsub|1>|)>>>|<row|<cell|>|<cell|=>|<cell|<wide|K|\<vect\>><rsub|1>+a<rsub|2>h\<partial\><rsub|t><wide|f|\<vect\>><around*|\||<rsub|<around*|(|t,<wide|x|\<vect\>>|)>>|\<nobracket\>>+b<rsub|21>h<wide|K|\<vect\>><rsub|1>\<cdot\>\<partial\><rsub|<wide|x|\<vect\>>><wide|f|\<vect\>><around*|\||<rsub|<around*|(|t,<wide|x|\<vect\>>|)>>|\<nobracket\>>+O<around*|(|h<rsup|2>|)>>>|<row|<cell|>|<cell|=>|<cell|<wide|K|\<vect\>><rsub|1>+a<rsub|2>h\<partial\><rsub|t><wide|f|\<vect\>><around*|\||<rsub|<around*|(|t,<wide|x|\<vect\>>|)>>|\<nobracket\>>+b<rsub|21>h<around*|(|<wide|f|\<vect\>>\<cdot\>\<partial\><rsub|<wide|x|\<vect\>>><wide|f|\<vect\>>|)><around*|\||<rsub|<around*|(|t,<wide|x|\<vect\>>|)>>|\<nobracket\>>+O<around*|(|h<rsup|2>|)>>>>>
</eqnarray>
So it requires
<\equation*>
<choice|<tformat|<table|<row|<cell|c<rsub|1>+c<rsub|2>=1,>>|<row|<cell|c<rsub|2>a<rsub|2>=<frac|1|2>,>>|<row|<cell|c<rsub|2>b<rsub|21>=<frac|1|2>.>>>>>
</equation*>
There are many choices (4 variables with 3 eqns) of the <math|c,a,b>'s
satisfying these requirements:
(i) <math|c<rsub|1>=c<rsub|2>=<frac|1|2>,a<rsub|2>=b<rsub|21>=1> gives the
improved Euler method;
(ii) <math|c<rsub|1>=0,c<rsub|2>=1,a<rsub|2>=b<rsub|21>=<frac|1|2>> gives
the midpoint method;
(iii) <math|c<rsub|1>=<frac|1|4>,c<rsub|2>=<frac|3|4>,a<rsub|2>=b<rsub|21>=<frac|2|3>>
gives the Heun method, and so on.
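For instance, choice (i) gives the following step (a small sketch, written
for a scalar ode):
<\scm-code>
;; improved Euler (c1 = c2 = 1/2, a2 = b21 = 1): one 2nd-order RK step
(define (improved-euler-step f t x h)
\ \ (let* ([k1 (f t x)]
\ \ \ \ \ \ \ \ \ [k2 (f (+ t h) (+ x (* h k1)))])
\ \ \ \ (+ x (* 0.5 h (+ k1 k2)))))
</scm-code>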
Here we give the <strong|explicit 4-level 4th-order Runge-Kutta method>
which is often used:
<\eqnarray>
<tformat|<table|<row|<cell|<wide|x|\<vect\>><rsub|n+1>>|<cell|=>|<cell|<wide|x|\<vect\>><rsub|n>+<frac|h|6><around*|(|<wide|K|\<vect\>><rsub|1>+2<wide|K|\<vect\>><rsub|2>+2<wide|K|\<vect\>><rsub|3>+<wide|K|\<vect\>><rsub|4>|)>,>>|<row|<cell|<wide|K|\<vect\>><rsub|1>>|<cell|=>|<cell|<wide|f|\<vect\>><around*|(|t<rsub|n>,<wide|x|\<vect\>><rsub|n>|)>,>>|<row|<cell|<wide|K|\<vect\>><rsub|2>>|<cell|=>|<cell|<wide|f|\<vect\>><around*|(|t<rsub|n>+<frac|1|2>h,<wide|x|\<vect\>><rsub|n>+<frac|1|2>h<wide|K|\<vect\>><rsub|1>|)>,>>|<row|<cell|<wide|K|\<vect\>><rsub|3>>|<cell|=>|<cell|<wide|f|\<vect\>><around*|(|t<rsub|n>+<frac|1|2>h,<wide|x|\<vect\>><rsub|n>+<frac|1|2>h<wide|K|\<vect\>><rsub|2>|)>,>>|<row|<cell|<wide|K|\<vect\>><rsub|4>>|<cell|=>|<cell|<wide|f|\<vect\>><around*|(|t<rsub|n>+h,<wide|x|\<vect\>><rsub|n>+h<wide|K|\<vect\>><rsub|3>|)>.>>>>
</eqnarray>
[\P4-level\Q means that 4 <math|<wide|K|\<vect\>><rsub|i>>s are used;
\P4th-order\Q means the (global) error is <math|O<around*|(|h<rsup|4>|)>>.]
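A direct transcription for a scalar ode (vectors work the same way, with
componentwise operations):
<\scm-code>
;; one classical RK4 step for dx/dt = f(t, x)
(define (rk4-step f t x h)
\ \ (let* ([k1 (f t x)]
\ \ \ \ \ \ \ \ \ [k2 (f (+ t (* 0.5 h)) (+ x (* 0.5 h k1)))]
\ \ \ \ \ \ \ \ \ [k3 (f (+ t (* 0.5 h)) (+ x (* 0.5 h k2)))]
\ \ \ \ \ \ \ \ \ [k4 (f (+ t h) (+ x (* h k3)))])
\ \ \ \ (+ x (* (/ h 6) (+ k1 (* 2 k2) (* 2 k3) k4)))))
;; dx/dt = x with x(0) = 1; ten steps of h = 0.1 give x(1) close to e
(let loop ([t 0.0] [x 1.0] [i 0])
\ \ (if (= i 10) x
\ \ \ \ \ \ (loop (+ t 0.1) (rk4-step (lambda (t x) x) t x 0.1) (+ i 1))))
;; =\<gtr\> 2.718279... (e = 2.718281...)
</scm-code>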
Note that the step <math|h> above is taken to be constant, but the steps can
be chosen as one likes.
Here we give an adaptive-step Runge-Kutta method, called the
<strong|Runge-Kutta-Fehlberg method>, which is widely used:
Fehlberg (1970) found a choice of 4th-order RK method that shares its
<math|K<rsub|i>>s with a 5th-order RK method, so the 5th-order RK solution
can control the 4th-order RK solution by adapting the steps, without
needing many more <math|K<rsub|i>>s. The result is as follows:
<strong|RKF method>
4th-order RK:
<\equation*>
<wide|x|\<vect\>><rsub|n+1>=<wide|x|\<vect\>><rsub|n>+<frac|25|216><wide|K|\<vect\>><rsub|1>+<frac|1408|2565><wide|K|\<vect\>><rsub|3>+<frac|2197|4104><wide|K|\<vect\>><rsub|4>-<frac|1|5><wide|K|\<vect\>><rsub|5>,
</equation*>
5th-order RK:
<\equation*>
<wide|\<xi\>|\<vect\>><rsub|n+1>=<wide|x|\<vect\>><rsub|n>+<frac|16|135><wide|K|\<vect\>><rsub|1>+<frac|6656|12825><wide|K|\<vect\>><rsub|3>+<frac|28561|56430><wide|K|\<vect\>><rsub|4>-<frac|9|50><wide|K|\<vect\>><rsub|5>+<frac|2|55><wide|K|\<vect\>><rsub|6>,
</equation*>
where
<\eqnarray>
<tformat|<table|<row|<cell|<wide|K|\<vect\>><rsub|1>>|<cell|=>|<cell|h<wide|f|\<vect\>><around*|(|t<rsub|n>,<wide|x|\<vect\>><rsub|n>|)>,>>|<row|<cell|<wide|K|\<vect\>><rsub|2>>|<cell|=>|<cell|h<wide|f|\<vect\>><around*|(|t<rsub|n>+<frac|h|4>,<wide|x|\<vect\>><rsub|n>+<frac|1|4><wide|K|\<vect\>><rsub|1>|)>,>>|<row|<cell|<wide|K|\<vect\>><rsub|3>>|<cell|=>|<cell|h<wide|f|\<vect\>><around*|(|t<rsub|n>+<frac|3|8>h,<wide|x|\<vect\>><rsub|n>+<frac|3|32><wide|K|\<vect\>><rsub|1>+<frac|9|32><wide|K|\<vect\>><rsub|2>|)>,>>|<row|<cell|<wide|K|\<vect\>><rsub|4>>|<cell|=>|<cell|h<wide|f|\<vect\>><around*|(|t<rsub|n>+<frac|12|13>h,<wide|x|\<vect\>><rsub|n>+<frac|1932|2197><wide|K|\<vect\>><rsub|1>-<frac|7200|2197><wide|K|\<vect\>><rsub|2>+<frac|7296|2197><wide|K|\<vect\>><rsub|3>|)>,>>|<row|<cell|<wide|K|\<vect\>><rsub|5>>|<cell|=>|<cell|h<wide|f|\<vect\>><around*|(|t<rsub|n>+h,<wide|x|\<vect\>><rsub|n>+<frac|439|216><wide|K|\<vect\>><rsub|1>-8<wide|K|\<vect\>><rsub|2>+<frac|3680|513><wide|K|\<vect\>><rsub|3>-<frac|845|4104><wide|K|\<vect\>><rsub|4>|)>,>>|<row|<cell|<wide|K|\<vect\>><rsub|6>>|<cell|=>|<cell|h<wide|f|\<vect\>><around*|(|t<rsub|n>+<frac|h|2>,<wide|x|\<vect\>><rsub|n>-<frac|8|27><wide|K|\<vect\>><rsub|1>+2<wide|K|\<vect\>><rsub|2>-<frac|3544|2565><wide|K|\<vect\>><rsub|3>-<frac|1859|4104><wide|K|\<vect\>><rsub|4>-<frac|11|40><wide|K|\<vect\>><rsub|5>|)>.>>>>
</eqnarray>
The error <math|\<tau\><rsub|n+1>=<frac|<around*|\||<wide|x|\<vect\>><around*|(|t<rsub|n+1>|)>-<wide|x|\<vect\>><rsub|n+1>|\|>|h>\<sim\>\<lambda\>h<rsup|4>>
can be approximated by <math|<around*|\||<wide|\<xi\>|\<vect\>><rsub|n+1>-<wide|x|\<vect\>><rsub|n+1>|\|>/h>,
so when the error is larger than <math|TOL>, i.e.
<math|<frac|<around*|\||<wide|\<xi\>|\<vect\>><rsub|n+1>-<wide|x|\<vect\>><rsub|n+1>|\|>|h>\<gtr\>TOL>,
we can take a smaller step <math|<wide|h|~>\<equiv\>q h>, with
<math|q=<around*|[|<frac|TOL|<around*|\||<wide|\<xi\>|\<vect\>><rsub|n+1>-<wide|x|\<vect\>><rsub|n+1>|\|>/h>|]><rsup|1/4>>,
which in practice is taken as
<\equation*>
q=<around*|[|<frac|TOL|2<around*|\||<wide|\<xi\>|\<vect\>><rsub|n+1>-<wide|x|\<vect\>><rsub|n+1>|\|>/h>|]><rsup|1/4>\<simeq\>0.84<around*|[|<frac|TOL|R>|]><rsup|1/4>,
</equation*>
where <math|R> is the current error
<\equation*>
R\<equiv\><frac|<around*|\||<wide|\<xi\>|\<vect\>><rsub|n+1>-<wide|x|\<vect\>><rsub|n+1>|\|>|h>=<frac|1|h><around*|\||<frac|1|360><wide|K|\<vect\>><rsub|1>-<frac|128|4275><wide|K|\<vect\>><rsub|3>-<frac|2197|75240><wide|K|\<vect\>><rsub|4>+<frac|1|50><wide|K|\<vect\>><rsub|5>+<frac|2|55><wide|K|\<vect\>><rsub|6>|\|>.
</equation*>
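In code the step-size control is a one-liner (the clamping bounds, here 0.1
and 4.0, are a common practical convention, not part of the formula above):
<\scm-code>
;; scale the step by q = 0.84 (TOL/R)^{1/4}, clamped to [0.1, 4]
(define (rkf-next-h h R tol)
\ \ (let ([q (* 0.84 (expt (/ tol R) 0.25))])
\ \ \ \ (* h (max 0.1 (min 4.0 q)))))
;; if R \<gtr\> TOL, the step is rejected and retried with the smaller h;
;; otherwise x_{n+1} is accepted and the new h is used for the next step
</scm-code>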
<strong|Implicit RK method>
In an implicit one-step method, the iteration function
<math|<wide|\<Phi\>|\<vect\>>> is also a function of
<math|<wide|x|\<vect\>><rsub|n+1>>; here we give only one implicit RK
method, the 1-level 2nd-order method:
<\eqnarray>
<tformat|<table|<row|<cell|<wide|x|\<vect\>><rsub|n+1>>|<cell|=>|<cell|<wide|x|\<vect\>><rsub|n>+<wide|K|\<vect\>><rsub|1>,>>|<row|<cell|<wide|K|\<vect\>><rsub|1>>|<cell|=>|<cell|h<wide|f|\<vect\>><around*|(|t<rsub|n>+<frac|h|2>,<wide|x|\<vect\>><rsub|n>+<frac|1|2><wide|K|\<vect\>><rsub|1>|)>.>>>>
</eqnarray>
Note that to get <math|<wide|x|\<vect\>><rsub|n+1>>, one needs to solve for
<math|<wide|K|\<vect\>><rsub|1>> in the second eqn, which is non-linear.
Implicit methods are used to solve some special eqns, such as stiff
odes.
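For a non-stiff ode, <math|<wide|K|\<vect\>><rsub|1>> can be found by a
simple fixed-point iteration, as in the sketch below; for stiff odes one
would apply Newton's method to this eqn instead, since fixed-point
iteration converges only for small <math|h>.
<\scm-code>
;; solve K1 = h f(t + h/2, x + K1/2) by fixed-point iteration (scalar ode)
(define (implicit-midpoint-step f t x h tol)
\ \ (let loop ([k (* h (f t x))]) ;; start from the explicit slope
\ \ \ \ (let ([k1 (* h (f (+ t (* 0.5 h)) (+ x (* 0.5 k))))])
\ \ \ \ \ \ (if (\<less\> (abs (- k1 k)) tol)
\ \ \ \ \ \ \ \ \ \ (+ x k1)
\ \ \ \ \ \ \ \ \ \ (loop k1)))))
</scm-code>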
<subsection|Multi-step methods: Adams methods>
xxxxxxxx
\;
<section|Monte Carlo methods>
Monte Carlo methods are a large class of methods that use statistical
techniques to address numerical mathematical problems. They are treated in
a separate note; see <verbatim|./monte-carlo-methods/monte-carlo-methods.tm(pdf)>.
\;
</body>
<\initial>
<\collection>
<associate|page-medium|papyrus>
</collection>
</initial>
<\references>
<\collection>
<associate|Gauss-Siedel-iteration-func|<tuple|13|23>>
<associate|Jacobi-iteration-func|<tuple|12|23>>
<associate|Newton|<tuple|9|5>>
<associate|Newton-Sceant-multivars|<tuple|15|26>>
<associate|Newton-method-multivars|<tuple|14|25>>
<associate|SD|<tuple|8|5>>
<associate|armijo-condition|<tuple|4|2>>
<associate|auto-1|<tuple|1|2>>
<associate|auto-10|<tuple|4|10>>
<associate|auto-11|<tuple|2.3|10>>
<associate|auto-12|<tuple|2.3.1|12>>
<associate|auto-13|<tuple|2.3.2|12>>
<associate|auto-14|<tuple|5|13>>
<associate|auto-15|<tuple|2.4|14>>
<associate|auto-16|<tuple|2.5|15>>
<associate|auto-17|<tuple|2.5.1|15>>
<associate|auto-18|<tuple|2.5.2|17>>
<associate|auto-19|<tuple|3|18>>
<associate|auto-2|<tuple|2|2>>
<associate|auto-20|<tuple|3.1|18>>
<associate|auto-21|<tuple|3.1.1|18>>
<associate|auto-22|<tuple|3.1.2|19>>
<associate|auto-23|<tuple|3.1.3|19>>
<associate|auto-24|<tuple|3.1.4|20>>
<associate|auto-25|<tuple|3.2|21>>
<associate|auto-26|<tuple|3.2.1|21>>
<associate|auto-27|<tuple|3.2.2|22>>
<associate|auto-28|<tuple|3.2.3|23>>
<associate|auto-29|<tuple|3.2.4|23>>
<associate|auto-3|<tuple|2.1|2>>
<associate|auto-30|<tuple|3.2.5|24>>
<associate|auto-31|<tuple|3.2.6|24>>
<associate|auto-32|<tuple|3.2.7|25>>
<associate|auto-33|<tuple|3.3|25>>
<associate|auto-34|<tuple|3.3.1|25>>
<associate|auto-35|<tuple|3.3.2|26>>
<associate|auto-36|<tuple|3.3.3|27>>
<associate|auto-37|<tuple|3.3.4|29>>
<associate|auto-38|<tuple|4|30>>
<associate|auto-39|<tuple|4.1|30>>
<associate|auto-4|<tuple|2.2|2>>
<associate|auto-40|<tuple|4.2|32>>
<associate|auto-41|<tuple|5|32>>
<associate|auto-5|<tuple|2.2.1|2>>
<associate|auto-6|<tuple|1|3>>
<associate|auto-7|<tuple|2.2.2|5>>
<associate|auto-8|<tuple|2|6>>
<associate|auto-9|<tuple|3|9>>
<associate|cc|<tuple|7|3>>
<associate|curvature-condition|<tuple|5|2>>
<associate|descent-condtion|<tuple|3|2>>
<associate|iteration-eqns-linear|<tuple|11|22>>
<associate|iteration-relation-of-optimization|<tuple|2|2>>
<associate|iterative-eqns|<tuple|17|27>>
<associate|linear-approx|<tuple|16|27>>
<associate|optimization-problem|<tuple|1|2>>
<associate|quasi-newton|<tuple|18|27>>
<associate|sdc|<tuple|6|3>>
<associate|solve-distance|<tuple|10|11>>
</collection>
</references>
<\auxiliary>
<\collection>
<\associate|figure>
<tuple|normal|<\surround|<hidden-binding|<tuple>|1>|>
Region for <with|mode|<quote|math>|s<rsub|i+1>>: sdc(light-green) and
cc(deep-green).
</surround>|<pageref|auto-6>>
<tuple|normal|<\surround|<hidden-binding|<tuple>|2>|>
Subspace method in line-search algorithm [note that
<with|mode|<quote|math>|<frac|\<partial\>f|\<partial\>d<rsub|0>><around*|\||<rsub|x<rsub|1>>|\<nobracket\>>=<with|font-series|<quote|bold>|d><rsub|0>\<cdot\><with|font-series|<quote|bold>|g><rsub|1>\<sim\>0>]
</surround>|<pageref|auto-8>>
<tuple|normal|<\surround|<hidden-binding|<tuple>|3>|>
\;
</surround>|<pageref|auto-9>>
<tuple|normal|<\surround|<hidden-binding|<tuple>|4>|>
<with|mode|<quote|math>|<around*|\<\|\|\>|x<rsub|<wide|t|^>>-x<rsub|2>|\<\|\|\>>=\<delta\><around*|\<\|\|\>|\<Delta\>x|\<\|\|\>>>.
</surround>|<pageref|auto-10>>
<tuple|normal|<\surround|<hidden-binding|<tuple>|5>|>
Three cases of <with|mode|<quote|math>|m<around*|(|\<mu\>,\<nu\>|)>>:
a) <with|mode|<quote|math>|m<around*|(|\<mu\>,\<nu\>|)>=<frac|1|2><around*|(|\<mu\><rsup|2>+<frac|\<nu\><rsup|2>|0.1<rsup|2>>|)>>;
b) <with|mode|<quote|math>|m<around*|(|\<mu\>,\<nu\>|)>=<frac|1|2><around*|(|-\<mu\><rsup|2>+<frac|\<nu\><rsup|2>|0.1<rsup|2>>|)>>;
c) <with|mode|<quote|math>|m<around*|(|\<mu\>,\<nu\>|)>=\<nu\><rsup|2>-\<mu\>>.
Note: the best solution in the region
<with|mode|<quote|math>|\<Delta\><rsub|1>> centered by
<with|mode|<quote|math>|x<rsub|1>> is
<with|mode|<quote|math>|x<rsub|2>> not
<with|mode|<quote|math>|x<rsub|2><rprime|'>>.
</surround>|<pageref|auto-14>>
</associate>
<\associate|toc>
<vspace*|1fn><with|font-series|<quote|bold>|math-font-series|<quote|bold>|1<space|2spc>Introduction>
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-1><vspace|0.5fn>
<vspace*|1fn><with|font-series|<quote|bold>|math-font-series|<quote|bold>|2<space|2spc>Optimization
problems> <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-2><vspace|0.5fn>
<with|par-left|<quote|1tab>|2.1<space|2spc>Unconstrained optimization
problem <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-3>>
<with|par-left|<quote|1tab>|2.2<space|2spc>Line search approach
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-4>>
<with|par-left|<quote|2tab>|2.2.1<space|2spc>Line search algorithm
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-5>>
<with|par-left|<quote|2tab>|2.2.2<space|2spc>Different methods to
descent direction <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-7>>
<with|par-left|<quote|1tab>|2.3<space|2spc>Trust region approach
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-11>>
<with|par-left|<quote|2tab>|2.3.1<space|2spc>Newton, Quasi-Newton, LMF
and CG methods <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-12>>
<with|par-left|<quote|2tab>|2.3.2<space|2spc>Subspace method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-13>>
<with|par-left|<quote|1tab>|2.4<space|2spc>Strict-inequality
constrained optimization problem <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-15>>
<with|par-left|<quote|1tab>|2.5<space|2spc>Constrained optimization
problem <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-16>>
<with|par-left|<quote|2tab>|2.5.1<space|2spc>penalty function method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-17>>
<with|par-left|<quote|2tab>|2.5.2<space|2spc>Augmented Lagrange penalty
function method <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-18>>
<vspace*|1fn><with|font-series|<quote|bold>|math-font-series|<quote|bold>|3<space|2spc>Algebraic
equations> <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-19><vspace|0.5fn>
<with|par-left|<quote|1tab>|3.1<space|2spc>Single variable eqn
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-20>>
<with|par-left|<quote|2tab>|3.1.1<space|2spc>Bisection method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-21>>
<with|par-left|<quote|2tab>|3.1.2<space|2spc>Iteration-function method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-22>>
<with|par-left|<quote|2tab>|3.1.3<space|2spc>Mixed method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-23>>
<with|par-left|<quote|2tab>|3.1.4<space|2spc>Accelerating iteration:
Aitken method and its super version
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-24>>
<with|par-left|<quote|1tab>|3.2<space|2spc>Linear eqns
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-25>>
<with|par-left|<quote|2tab>|3.2.1<space|2spc>Direct methods
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-26>>
<with|par-left|<quote|2tab>|3.2.2<space|2spc>Iterative methods
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-27>>
<with|par-left|<quote|2tab>|3.2.3<space|2spc>Jacobi method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-28>>
<with|par-left|<quote|2tab>|3.2.4<space|2spc>Gauss-Seidel method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-29>>
<with|par-left|<quote|2tab>|3.2.5<space|2spc>Successive Over-Relaxation (SOR)
method <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-30>>
<with|par-left|<quote|2tab>|3.2.6<space|2spc>Aitken method and its
super version <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-31>>
<with|par-left|<quote|2tab>|3.2.7<space|2spc>The convergence problem
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-32>>
<with|par-left|<quote|1tab>|3.3<space|2spc>Non-linear eqns
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-33>>
<with|par-left|<quote|2tab>|3.3.1<space|2spc>Newton method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-34>>
<with|par-left|<quote|2tab>|3.3.2<space|2spc>Newton-Secant method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-35>>
<with|par-left|<quote|2tab>|3.3.3<space|2spc>Quasi-Newton method
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-36>>
<with|par-left|<quote|2tab>|3.3.4<space|2spc>Levenberg-Marquardt
method <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-37>>
<vspace*|1fn><with|font-series|<quote|bold>|math-font-series|<quote|bold>|4<space|2spc>Ordinary
differential eqns: initial-value problems>
<datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-38><vspace|0.5fn>
<with|par-left|<quote|1tab>|4.1<space|2spc>One-step methods:
Runge-Kutta method <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-39>>
<with|par-left|<quote|1tab>|4.2<space|2spc>Multi-step methods: Adams
methods <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-40>>
<vspace*|1fn><with|font-series|<quote|bold>|math-font-series|<quote|bold>|5<space|2spc>Monte
Carlo methods> <datoms|<macro|x|<repeat|<arg|x>|<with|font-series|medium|<with|font-size|1|<space|0.2fn>.<space|0.2fn>>>>>|<htab|5mm>>
<no-break><pageref|auto-41><vspace|0.5fn>
</associate>
</collection>
</auxiliary>