Stochastic Logic Programs: Sampling, Inference and Applications

James Cussens

Department of Computer Science, University of York
Heslington, York, YO10 5DD, UK

jc@cs.york.ac.uk

Abstract

Algorithms for exact and approximate inference in stochastic logic programs (SLPs) are presented, based, respectively, on variable elimination and importance sampling. We then show how SLPs can be used to represent prior distributions for machine learning, using (i) logic programs and (ii) Bayes net structures as examples. Drawing on existing work in statistics, we apply the Metropolis-Hastings algorithm to construct a Markov chain which samples from the posterior distribution. A Prolog implementation for this is described. We also discuss the possibility of constructing explicit representations of the posterior.

1 Introduction

A stochastic logic program (SLP) is a probabilistic extension of a normal logic program that has been proposed as a flexible way of representing complex probabilistic knowledge; generalising, for example, Hidden Markov Models, Stochastic Context-Free Grammars and Markov nets (Muggleton, 1996; Cussens, 1999). However, we need to ask (i) whether this increase in flexibility is needed for any real problems and (ii) whether reasonable algorithms exist for inference and learning in SLPs.

In this paper we give a number of approaches to approximate and exact inference in SLPs, focusing mostly on sampling. More importantly, we apply SLPs to an important problem, Bayesian machine learning, which would be difficult to handle with simpler representations.

The paper is organised as follows. Section 2 contains essential definitions from logic programming. Section 3 shows we can use SLPs to define three different sorts of distributions, but focuses on the appealingly simple loglinear model. Sections 4 and 5 give two quite different ways of improving sampling for the loglinear model, being based on Prolog and importance sampling, respectively. Section 6 briefly shows how variable elimination can be used for exact inference in SLPs. After showing how to extend SLPs with non-equational constraints in Section 7, we can finally bring much of the preceding work together in Sections 8 and 9 to show how SLPs can be used to represent distributions over complex model spaces, such as is required for 'really Bayesian' machine learning. Section 10 contains conclusions and pointers to possible future work.

2 Logic Programming Essentials

An overview of logic programming can be found in (Cussens, 1999). Here, lack of space means that only the most important definitions are flagged. An SLD-derivation of a goal $G_0$ using a logic program $P$ is a (finite or infinite) sequence, each element of which is either a 4-tuple $(G_i, A_i, C_i, \theta_i)$ or the empty goal $\square$, where

- $A_i$ is the selected atom of goal $G_i$;

- $C_i$ is the selected input clause in $P$, with its variables renamed so that $C_i$ and $G_i$ have no variables in common;

- $\theta_i$ is the most general unifier of $A_i$ and $H_i$ (the head of $C_i$), or fail if they cannot be unified.

If $\theta_i =$ fail then $G_{i+1} =$ fail. Otherwise, $G_{i+1}$ is the result of replacing $A_i$ by $B_i$ (the body of $C_i$) in $G_i$ and then applying $\theta_i$ to the result. The derivation ends if $G_{i+1}$ is $\square$ or fail.

An SLD-refutation is a finite SLD-derivation ending in the empty goal $\square$. The SLD-tree for a goal $G_0$ is a tree of goals, with $G_0$ as root node, and such that the children of any goal $G$ are the goals produced by one resolution step using $G$ ($\square$ and fail have no children). Fig 6 shows an SLD-tree. A computed answer for a goal $G_0$ is a substitution for the variables in $G_0$ produced by an SLD-refutation of $G_0$.

3 Defining Distributions with SLPs

A stochastic logic program (SLP) is a logic program where some of the predicates (the distribution-defining or probabilistic predicates) have non-negative numbers attached to the clauses which make up their definitions. We denote the label attached to a clause $C$ by $l(C)$ (writing $l_i$ for $l(C_i)$), and we denote the parameter vector containing all the labels by $\lambda$. In this paper we will consider only normalised SLPs, where the labels for the clauses making up the definition of a predicate sum to one.

The basic idea is that clause labels probabilistically influence which input clause is chosen as a derivation proceeds, thus defining a distribution over derivations from which distributions over (i) refutations and (ii) variable bindings may be derived. We begin by restricting attention to pure SLPs, where all predicates have labelled definitions, postponing the impure case until Section 7.

3.1 Loglinear Model

Given an SLP with parameters $\lambda$ and a goal $G_0$, we can sample SLD-derivations for $G_0$ using loglinear sampling as follows:

Loglinear sampling: Any computation rule may be used to select the atom $A_i$ from the current goal $G_i$. The next input clause $C_i$ is chosen with probability $l(C_i)$ from those clauses in $P$ with the same predicate symbol in the head as $A_i$. We stop when we produce either fail or $\square$.
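As a concrete aside, the clause choice at the heart of this procedure is a few lines of Prolog. The sketch below is ours, not the paper's implementation; it assumes the labelled definition of the relevant predicate has been collected into a list of Label-Item pairs (for a normalised SLP the labels sum to one) and uses the random/1 built-in of Yap and SWI-Prolog.

% sample_labelled(+Pairs, -Item): pick Item from a list of
% Label-Item pairs with probability equal to its label.
sample_labelled(Pairs, Item) :-
    random(R),                 % R uniform on [0,1)
    pick(Pairs, R, Item).

pick([_-Item], _, Item) :- !.  % last pair: guards against rounding
pick([L-Item|_], R, Item) :- R < L, !.
pick([L-_|Rest], R, Item) :- R1 is R - L, pick(Rest, R1, Item).

For example, sample_labelled([0.3-v1, 0.7-v2], X) binds X to v1 with probability 0.3; loglinear sampling applies this choice at every derivation step to the clauses whose head predicate matches the selected atom.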

Let $R(G_0)$ denote the set of refutations of the goal $G_0$ (i.e. derivations that end in $\square$). Let $\nu_C(r)$ be the number of times the labelled clause $C$ is used in refutation $r$. The loglinear distribution over refutations of the goal $G_0$ is

$$p_\lambda(r) = \frac{\psi_\lambda(r)}{Z_\lambda}, \qquad \psi_\lambda(r) = \prod_C l(C)^{\nu_C(r)}, \qquad Z_\lambda = \sum_{r' \in R(G_0)} \psi_\lambda(r').$$

This is the original sampling mechanism defined by (Muggleton, 1996).

3.2 Unification-Constrained Model

In unification-constrained sampling the next input clause is chosen, with probability proportional to its label, from the set $\mathrm{unif}_i$ of those clauses in $P$ whose heads unify with the selected atom $A_i$; as before, we stop as soon as we produce either fail or $\square$. It is a theorem of logic programming that normal SLD-resolution is essentially unaltered by changing the atom selection rule (although drastic changes in efficiency can occur). If we delay selecting an atom, all that changes is that it may be more instantiated when we eventually do select it. Unfortunately, this means that delaying the selection of an atom will alter $\mathrm{unif}_i$, and so we are obliged to fix the selection rule in unification-constrained sampling so that a single distribution is defined.

The unification-constrained distribution over refutations for a goal $G_0$ is

$$p^{\mathrm{unif}}_\lambda(r) \propto \prod_i \frac{l(C_i)}{\sum_{C \in \mathrm{unif}_i} l(C)}.$$

Computing the unnormalised probability (potential) of a refutation is only slightly less easy than with the loglinear model, since all the values $\sum_{C \in \mathrm{unif}_i} l(C)$ can be quickly found as the refutation proceeds.

3.3 Backtrackable Model

Unification-constrained sampling stops as soon as it reaches fail; backtrackable sampling backtracks upon failure, and so will generally be a more efficient way of sampling refutations. Backtrackable sampling is essentially Prolog with a probabilistic clause selector.

Backtrackable sampling: The selected atom is always the leftmost atom in the current goal. The next input clause is chosen with probability proportional to its label. If we produce fail then we backtrack to the most recent choice-point, delete the choice of input clause that led to failure and choose from amongst the surviving choices with probability proportional to clause labels. If no choices remain then we backtrack further until a choice can be made, or return fail. We stop if we produce $\square$ or fail.
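Since backtrackable sampling is just Prolog with a probabilistic clause selector, a small meta-interpreter captures it. The following is a minimal sketch under our own representation (not code from the paper), where each labelled clause is stored as a fact slp_clause(Label, Head, Body), with Body = true for unit clauses; note that findall/3 copies terms, which conveniently renames clause variables apart.

% sample(+Goal): draw one backtrackable sample of Goal.
sample(true) :- !.
sample((A,B)) :- !, sample(A), sample(B).
sample(A) :-
    findall(L-(H:-B), (slp_clause(L,H,B), \+ \+ H = A), Pairs),
    choose(Pairs, (H:-B)),     % retried with the failed choice deleted
    A = H,
    sample(B).

% choose(+Pairs, -Clause): pick a clause with probability proportional
% to its label; on backtracking, delete it and renormalise over the rest.
choose(Pairs, Clause) :-
    Pairs \= [],
    total(Pairs, Z),
    random(R0), R is R0 * Z,
    pick_out(Pairs, R, Picked, Rest),
    ( Clause = Picked
    ; choose(Rest, Clause)
    ).

total([], 0.0).
total([L-_|T], Z) :- total(T, Z0), Z is Z0 + L.

pick_out([_-C], _, C, []) :- !.          % single candidate left
pick_out([L-C|T], R, C, T) :- R < L, !.
pick_out([L-P|T], R, Picked, [L-P|Rest]) :-
    R1 is R - L, pick_out(T, R1, Picked, Rest).

Prolog's own backtracking drives the deletion: if the derivation below a choice fails, execution returns to the disjunction in choose/2 and a fresh draw is made from the surviving clauses, exactly as the definition above requires.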

Let $\mathrm{succ}_i$ be the set of input clauses which lead to at least one refutation. The backtrackable distribution over refutations for a goal $G_0$ is

$$p^{\mathrm{back}}_\lambda(r) = \prod_i \frac{l(C_i)}{\sum_{C \in \mathrm{succ}_i} l(C)}.$$

Computing the unnormalised probability (potential) of a refutation is, in general, hard, since we may have to explore a very large SLD-tree rooted at $G_0$ to identify the sets $\mathrm{succ}_i$.

Comparing the loglinear, unification-constrained and backtrackable models, we see there is a trade-off between ease of sampling a refutation and ease of computing a potential for a given refutation. If only sampling is required and we are happy that the order of literals in clauses matters, then the backtrackable model makes sense. However, the loglinear model has attractively simple mathematical properties which we will exploit in the MCMC application. Fortunately, loglinear sampling can be sped up, as the next section shows.

4 Improving Loglinear Sampling

(Muggleton, 1996) explicitly introduced SLPs as generalisations of Hidden Markov Models (HMMs) and Stochastic Context-Free Grammars (SCFGs). Comparing SLPs to HMMs, we see that in SLPs: (i) the states of an HMM are replaced by goals; (ii) the outputs of an HMM are replaced by substitutions; (iii) concatenation of outputs is replaced by composition of substitutions; (iv) outputs (substitutions) are generated deterministically; and (v) state transition probabilities are given by clause labels. It is also more natural in SLPs to associate outputs (substitutions) with transitions between states (goals) rather than with states themselves.

The connection between SLPs and SCFGs is even closer. Consider the SCFG in Fig 1, which has been implemented as an SLP. We can generate sentences using loglinear sampling with the goal :- s(A,[]). Since the program is an SCFG, the query will always succeed, even though we do not allow backtracking in the loglinear model.

1:   s(A,B) :- n(A,C), v(C,D), n(D,B).
0.4: n([joe|T],T).
0.6: n([kim|T],T).
0.3: v([sees|T],T).
0.7: v([likes|T],T).

Figure 1: An SCFG implemented as an SLP

Suppose now that we are interested only in reflexive sentences. We then apply a constraint to the SCFG, replacing

1: s(A,B) :- n(A,C), v(C,D), n(D,B).

with

1: s(A,B) :- n(A,C), v(C,D), n(D,B),
             A = [N|T1], D = [N|T2].

or more concisely:

1: s([N|T1],B) :- n([N|T1],C),
                  v(C,[N|T2]), n([N|T2],B).

Now we cannot guarantee that :- s(A,[]). will always succeed; the grammar is no longer context-free. This means that sampling sentences from the new conditional distribution (conditional on the constraint being satisfied) is less efficient. We have to throw away derivations that are inconsistent with the constraint, just as with forward sampling in Bayes nets in the presence of evidence.

In the loglinear model we may select any atom from the current goal, which means that the order of literals in the bodies of clauses does not affect the distribution. However, since we will use Prolog to implement SLPs, we can exploit Prolog's leftmost atom selection rule to force constraints to be effected as early as possible. We do this by simply moving the constraints leftwards so that Prolog encounters them earlier. This has the effect of producing fail as soon as our choice of input clauses has ensured that a derivation cannot succeed. Fig 2 has an ordering of body literals for s/2 that ensures that a derivation fails as soon as we pick a second noun which is not the same as the first; we don't waste time choosing the verb. The moral is: it is better to fail sooner than later.

1:   s([N|T1],B) :- n([N|T1],C), n([N|T2],B), v(C,[N|T2]).
0.4: n([joe|T],T).
0.6: n([kim|T],T).
0.3: v([sees|T],T).
0.7: v([likes|T],T).

Figure 2: A simple grammar
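To see the reordering at work, here is an illustrative query (ours, not an experiment from the paper) against the Fig 2 grammar loaded into the backtrackable sampler sketched in Section 3.3:

?- sample(s(A,[])).
A = [kim, likes, kim]    % one possible draw

After the first noun is chosen, any mismatched second noun fails at head unification with [N|T2] before the verb is ever sampled, so no probabilistic choice is wasted on a doomed branch.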

5 Importance Sampling for SLPs

SLPs are only required for complex distributions, where it is optimistic to depend on exact inference. Approximate inference can be based on sampling, where e.g. to estimate the probability of some event, we sample from the SLP and obtain the event's relative frequency. Unfortunately, even with the Prolog-based speedup given above, pure loglinear sampling can still be slow. However, since the unification-constrained distribution is easier to sample from than the loglinear distribution, the obvious solution is to use importance sampling (Gelman et al., 1995). We produce samples from $p^{\mathrm{unif}}_\lambda$ and then weight them with the importance weights $w(r)$. We have:

$$w(r) = \frac{\psi_\lambda(r)}{p^{\mathrm{unif}}_\lambda(r)} \propto \prod_i \sum_{C \in \mathrm{unif}_i} l(C).$$

We can update $\prod_i \sum_{C \in \mathrm{unif}_i} l(C)$ as we go, so it is therefore easy to compute the weights. The end result is a set of weighted samples for a particular goal, where the weights are known up to a normalising constant. For approximate inference, the unknown normalising constant will often cancel out. For example, it is frequently enough to estimate the ratio between probabilities.
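The weight bookkeeping can ride along with the sampler. Continuing the slp_clause/3 sketch of Section 3.3 (our illustration again, reusing total/2 and pick_out/4 from there), the proposal below is unification-constrained sampling, and at each derivation step it multiplies the running weight by the local normaliser $\sum_{C \in \mathrm{unif}_i} l(C)$:

% usample(+Goal, +W0, -W): sample Goal from the unification-constrained
% proposal, accumulating the importance weight W up to a constant.
usample(true, W, W) :- !.
usample((A,B), W0, W) :- !, usample(A, W0, W1), usample(B, W1, W).
usample(A, W0, W) :-
    findall(L-(H:-B), (slp_clause(L,H,B), \+ \+ H = A), Pairs),
    total(Pairs, Z),
    Z > 0.0,                        % no unifiable clause: sample fails
    W1 is W0 * Z,                   % multiply in the local normaliser
    random(R0), R is R0 * Z,
    pick_out(Pairs, R, (H:-B), _),  % no retry: reaching fail ends the run
    A = H,
    usample(B, W1, W).

Calling usample(G, 1.0, W) repeatedly and discarding failed runs yields the weighted samples; forming estimates as weighted relative frequencies means the unknown normalising constant cancels, as noted above.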

6 Exact Inference in SLPs

Each refutation of a goal $G$ produces a computed answer: variable bindings for the variables in $G$. We can define a distribution over the computed answers for $G$ by marginalisation: we sum over all refutations that produce a given computed answer. It is convenient to represent computed answers by atoms. Define the yield $Y(r)$ of a refutation $r$ of a unit goal $\leftarrow A$ to be $A\theta$, where $\theta$ is the computed answer for $\leftarrow A$ using $r$. Let $\theta$ be a computed answer for the goal $\leftarrow A$; then the corresponding yield is $A\theta$ and:

$$p_\lambda(A\theta) = \sum_{r:\, Y(r) = A\theta} p_\lambda(r).$$

If a goal can be decomposed into subgoals $G_1$ and $G_2$ which do not share variables, then the distribution over its yields factorises into the product of the distributions for the two subgoals.

For goals which cannot be decomposed into subgoals without common variables, we are forced to find splitting substitutions. A substitution $\theta$ splits two goals $G_1$ and $G_2$ if $G_1\theta$ and $G_2\theta$ do not share variables. Let $\Theta$ be a set of splitting substitutions for $G_1$ and $G_2$ which includes all computed answers for the goal $\leftarrow G_1, G_2$ restricted to the common variables of $G_1$ and $G_2$. Then:

$$p_\lambda(\leftarrow G_1, G_2) = \sum_{\theta \in \Theta} p_\lambda(\leftarrow G_1\theta)\, p_\lambda(\leftarrow G_2\theta).$$

We would like to find a small $\Theta$ fairly quickly. If each variable can only take a fairly small number of discrete values, as is often the case in Bayesian nets, we can just go through each of these values. We end this section by noting that these computations need to be vectorised to return a (finite) distribution over bindings for a variable, rather than a single probability.
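A tiny worked case, of our own devising, makes the mechanics concrete. Take the goal $\leftarrow p(X), q(X)$, whose subgoals share only the variable $X$, and suppose $X$ ranges over just $\{a, b\}$. The substitutions $\{X/a\}$ and $\{X/b\}$ split the subgoals, and

$$p_\lambda(\leftarrow p(X), q(X)) = p_\lambda(p(a))\, p_\lambda(q(a)) + p_\lambda(p(b))\, p_\lambda(q(b)),$$

so each factor is computed on a goal with no shared variables, much as variable elimination in a Bayes net sums out a shared variable.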

7 Impure SLPs

We can extend the definition of SLPs by going beyond conjunctions of equational constraints such as those used in Section 4. Fig 3 shows an SLP for a fragment of French, where there is a constraint that adjectives and nouns agree on gender. The g/2 predicate that defines the constraint is unlabelled, since it plays no part in defining the distribution except to cut out derivations which are inconsistent with it. When an unlabelled predicate is encountered in a derivation we only consider the first variable binding it produces (if any). Backtracking may be used to produce this first binding, but we may not return to seek another binding if we hit failure later on.

Equational constraints can be placed as early as we like. For other constraints, placing them too early can prevent permissible derivations from being found. The second (commented out) version of s/2 in Fig 3 has the g(A,G) too early. Since backtracking is banned, only the first il binding will be found, overly constraining the value of A, so e.g. elle sera vieille will never be produced. In the correct version we probabilistically (partially) instantiate the variable A by our choice of input clause, and only then effect the constraint on G. Since we are allowed to backtrack within the call to g/2, this call will always succeed. The key is that it must be the probabilistic predicates that choose the variable bindings that matter.

Effecting negated constraints too early can let through more derivations than is safe. The third version of s/2 starts with a negated goal that will succeed and produce no variable bindings, so no constraint is effected. Had this double negation been at the end of the clause it would have effected the desired constraint.

% Constraint too late - inefficient
% 1: s(A,B) :- n(A,C), v(C,D), a(D,B),
%              g(A,G), g(D,G).

% Constraint too early - overconstrained
% 1: s(A,B) :- g(A,G), n(A,C),
%              a(D,B), g(D,G), v(C,D).

% Constraint too early - underconstrained
% 1: s(A,B) :- \+ \+ (g(A,G), g(D,G)),
%              n(A,C), v(C,D), a(D,B).

% Constraint at the right time.
1: s(A,B) :- n(A,C), g(A,G),
             a(D,B), g(D,G), v(C,D).

0.4: n([il|T],T).
0.6: n([elle|T],T).
0.3: v([est|T],T).
0.7: v([sera|T],T).
0.2: a([vieux|T],T).
0.8: a([vieille|T],T).
g([il|_],m).
g([elle|_],f).
g([vieux|_],m).
g([vieille|_],f).

Figure 3: Gender agreement constraint

8 Machine Learning for Dogmatic Bayesians

    Finally, never forget that the goal of Bayesian computation is not the posterior mode, not the posterior mean, but a representation of the entire distribution, or summaries of that distribution such as 95% intervals for estimands of interest. (Gelman et al., 1995, p. 301; italics in the original)

'Bayesian' approaches in machine learning do not live up to this exacting demand to represent the entire posterior, usually settling for just the posterior mode (MAP algorithms) or particular expectations (Bayes optimal classification). In this paper, we show how SLPs can be used to define priors representing a wide range of biases and constraints, and also show how to sample from posteriors. Although we fall short of constructing (usable) explicit representations of the posterior, such a possibility cannot be ruled out.

Our approach is based on the process prior approach for decision trees developed independently by (Chipman et al., 1998) and (Denison et al., 1998). Our presentation will follow that given by (Chipman et al., 1998). In short:

    Instead of specifying a closed-form expression for the tree prior, $p(T)$, we specify $p(T)$ implicitly by a tree-generating stochastic process. Each realization of such a process can simply be considered a random draw from this prior. Furthermore, many specifications allow for straightforward evaluation of $p(T)$ for any $T$ and can be effectively coupled with efficient Metropolis-Hastings search algorithms (Denison et al., 1998).

We can use Metropolis-Hastings to sample from the posterior distribution over trees, by choosing an initial tree $T^0$ and producing new trees as follows (where $p(Y \mid T)$ is just the likelihood with tree $T$) (Denison et al., 1998):

1. Generate a candidate value $T^*$ with probability distribution $q(T^i, T^*)$.

2. Set $T^{i+1} = T^*$ with probability

$$\alpha(T^i, T^*) = \min\left(1,\; \frac{q(T^*, T^i)\, p(Y \mid T^*)\, p(T^*)}{q(T^i, T^*)\, p(Y \mid T^i)\, p(T^i)}\right),$$

else set $T^{i+1} = T^i$.
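In Prolog, the accept/reject step is a short predicate wrapped around this ratio. The sketch below is ours, not the paper's code; propose/2, proposal_ratio/3, likelihood/2 and prior/2 are assumed helpers returning the candidate, the ratio $q(T^*, T^i)/q(T^i, T^*)$, the likelihood and the prior as floats.

% mh_step(+T, -TNext): one Metropolis-Hastings transition.
mh_step(T, TNext) :-
    propose(T, TStar),
    proposal_ratio(T, TStar, QRatio),
    likelihood(TStar, LStar), likelihood(T, L),
    prior(TStar, PStar), prior(T, P),
    Alpha0 is QRatio * (LStar * PStar) / (L * P),
    Alpha is min(1.0, Alpha0),
    random(R),
    ( R < Alpha -> TNext = TStar ; TNext = T ).

Iterating mh_step/2 from an initial model and recording the models visited gives the posterior sample; the current model always has a positive prior and likelihood, so the division is safe, and a candidate with zero prior is rejected outright.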

Because SLPs define distributions over first-order atomic formulae (the yields of refutations) they can easily represent distributions over model structures such as decision trees, Bayesian nets and logic programs. We will denote models using $M$, possibly superscripted. The simplest, but possibly very inefficient, approach to defining priors over the model space with SLPs is as follows:

model(M) :- gen(M), ok(M).

gen/1 generates possible models just like an SCFG generates sentences: there are no constraints, so we never hit fail. ok/1 is then a constraint which filters out models which we do not wish to include in the model space.
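As a toy instance of the gen/ok pattern (our example, not the paper's), here is a prior over coin-flip sequences in the labelled-clause notation of Fig 1, with ok/1 insisting on at least one head:

model(M) :- gen(M), ok(M).
0.5:  gen([]).
0.25: gen([h|T]) :- gen(T).
0.25: gen([t|T]) :- gen(T).
ok(M) :- member(h, M).       % unlabelled: a pure constraint

gen/1 alone never fails, but roughly two-thirds of its probability mass falls on head-free sequences that ok/1 rejects, which is exactly the inefficiency discussed next; moving the constraint into gen/1, as Section 4 prescribes, removes the waste.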

If ok/1 rejects many of the models generated by gen/1, then it will be inefficient to sample from the prior, and this inefficiency translates into inefficiency when running the Metropolis-Hastings algorithm. The solution, as explained in Section 4, is to move constraints as early as possible without altering the distribution. This has been done in the SLPs of which fragments can be found in Fig 4 and Fig 5; these define priors over logic programs and Bayesian networks respectively. In each case, we simply define what a model is, using first-order logic, and then add labels to define a distribution over this model space. In the SLP of Fig 4 we have constraints that we do not want an empty theory and that any generated rules have not previously been generated (newrule/2). We also have a 'utility' constraint make_nice/2 which always succeeds and just rewrites the generated logic program into a more convenient form. In the SLP of Fig 5 we assume that the variable RVs is always instantiated to a list of names of random variables, so that model/2 defines a distribution over nets for the given random variables, where all the probabilistic information is associated with parents/3. A more efficient version would push the acyclic/1 constraint earlier.

model(LP) :- theory([],LP).
0.1: theory(Done,NicelyDone) :-
         \+ Done = [],
         make_nice(Done,NicelyDone).
0.9: theory(RulesSoFar,Done) :-
         rule(Rule),
         newrule(RulesSoFar,Rule),
         theory([Rule|RulesSoFar],Done).

Figure 4: Fragment of an SLP defining a prior over logic programs

model(RVs,Net) :- net(RVs,RVs,Net),
                  acyclic(Net).
net([],_,[]).
net([H|T],RVs,Net) :-
    parents(H,RVs,Ps),
    append(Ps,TNet,Net),
    net(T,RVs,TNet).

Figure 5: Fragment of an SLP defining a prior over Bayesian nets (for a fixed set of random variables)

8.1 Imaginary Models

Having carefully filtered out unwanted models, we find that it is convenient to re-admit them to the model space when we implement our posterior sampling algorithm. However, all these imaginary models, which previously did not have a probability defined for them, will now get probability zero. Doing this ensures that generating a new proposed model $M^*$ using $q(M^i, M^*)$ is simple. If the proposed model is imaginary then we will never accept it, since $p(M^*) = 0$ gives $\alpha = 0$. An analogous approach exists in analysis, where it is often easier to do real analysis within the space of complex numbers.

Recall that the distribution over atoms (here, models) is generated by marginalisation from a distribution of the same name over refutations of the goal $\leftarrow model(M)$. Extending our definition to include zero-probability imaginary models amounts to extending this underlying distribution on refutations to also include zero-probability SLD-derivations that end in fail. It turns out that this last distribution on derivations is the most convenient to work with. Note that each derivation corresponds to a leaf in an SLD-tree. We will associate a leaf corresponding to a failure derivation with fail, and a leaf corresponding to a refutation with the model yielded by that refutation. Non-leaf nodes of the SLD-tree will be associated with goals (see Fig 6).

8.2 The Transition Kernel

We can generate a new derivation (yielding a new proposed model $M^*$) from the derivation which yielded the current model $M^i$ as follows.

1. Backtrack one step to the most recent choice point in the SLD-tree.

2. We then probabilistically backtrack as follows: if at the top of the tree, stop. Otherwise backtrack one more step to the next choice point with probability $p_b$.

3. Use loglinear sampling to generate a derivation from the choice point chosen in the previous backtracking step. However, in the first step of sampling we may not choose the branch that leads back to $M^i$.

If the derivation so found ends in fail then $M^*$ is an imaginary model, so $\alpha = 0$, and we stay at $M^i$. The parameter $p_b$ controls the size of steps; if $p_b = 1$, we always restart from the top of the tree.
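Step 2 is a truncated geometric draw over the stack of choice points. A minimal Prolog sketch (ours, with the choice points above the current position passed explicitly, nearest first):

% backtrack_to(+PB, +ChoicePoints, -CP): stop at each choice point
% with probability 1-PB, but always stop at the top of the tree.
backtrack_to(_, [CP], CP) :- !.       % top of the tree: forced stop
backtrack_to(PB, [CP|_], CP) :-
    random(R), R >= PB, !.            % stop here with probability 1-PB
backtrack_to(PB, [_|Up], CP) :-
    backtrack_to(PB, Up, CP).         % climb one more with probability PB

Stopping after exactly $k$ backtracks therefore has probability $p_b^{k-1}(1-p_b)$, with the leftover mass landing on the top-level choice point; these are precisely the probabilities used in the acceptance ratio below.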

Now we must calculate $\alpha$ when $M^*$ turns out to be a real model. First let $G$ be the deepest common parent goal for $M^i$ and $M^*$. (Strictly, we should say "for the derivations that yield $M^i$ and $M^*$", but from now on we will abbreviate in this way.) There is only one way we can get from $M^i$ to $M^*$: backtracking to $G$ and then reaching $M^*$ from $G$. We cannot go via some parent of $G$ (such as $G_0$ in Fig 6), since we have prohibited going down the tree the same way we backtracked up it.

Suppose that $M^i$ is at depth $d$ in the SLD-tree, measured in choice points; then the probability of backtracking through $k$ choice points is $p_b^{k-1}(1-p_b)$ for $k < d$ and $p_b^{d-1}$ for $k = d$. Let $B_i$ be the random variable that gives the number of backtracks from $M^i$; similarly, we have $B^*$ for $M^*$. Then we find that

$$\frac{P(B^* = b^*)}{P(B_i = b_i)} = p_b^{\,b^* - b_i},$$

whether $G$ is the top-level choice point or not.

Figure 6: Jumping from $M^i$ to $M^*$ in the SLD-tree. The root $G_0$ is not a choice point; $G$ is the deepest common parent goal, left via clause $C_i$ on the derivation of $M^i$ and via clause $C^*$ on the derivation of $M^*$; one leaf below the top-level goal ends in fail (an imaginary model).

The probability of reaching $M^*$ starting from $G$ is $\psi_\lambda(M^* \mid G)/(1 - l(C_i))$, where $\psi_\lambda(M^* \mid G)$ denotes the product of the labels of the clauses used below $G$, $C^*$ is the clause which is used at $G$ to get to $M^*$, and the denominator renormalises for the excluded branch through $C_i$. So

$$q(M^i, M^*) = P(B_i = b_i)\, \frac{\psi_\lambda(M^* \mid G)}{1 - l(C_i)}.$$

Swapping the $i$ and $*$ symbols gives us $q(M^*, M^i)$. The SLD-tree in Fig 6 shows an example. Note the imaginary model under the top-level goal.

Next, note that $\psi_\lambda(M^*)/\psi_\lambda(M^i) = \psi_\lambda(M^* \mid G)/\psi_\lambda(M^i \mid G)$, since the labels of the clauses that get us as far as $G$ are the same for any model below $G$ and cancel, leaving us with just the labels of those clauses that get us below $G$.

Finally, note that $p(M) \propto \psi_\lambda(M)$, since we are actually dealing with derivations that yield models, and the normalising factor cancels out. Putting all this together (writing $p(y \mid M)$ for the likelihood), the $\psi_\lambda$ factors cancel and

$$\alpha(M^i, M^*) = \min\left(1,\; p_b^{\,b^* - b_i}\; \frac{1 - l(C_i)}{1 - l(C^*)}\; \frac{p(y \mid M^*)}{p(y \mid M^i)}\right).$$

9 A Prolog Implementation

theory(main,In,Out) -->
    select_clause(theory_2,ClauseNum),
    theory(ClauseNum,In,Out).

theory(theory_2_1,Done,NicelyDone) -->
    {\+ Done = []},
    {nice_all(Done,NicelyDone)}.

theory(theory_2_2,RulesSoFar,Done) -->
    rule(Rule),
    {newrule(RulesSoFar,Rule)}, !,
    theory(main,[Rule|RulesSoFar],Done).

labels(theory_2, [cdp(theory_2_1,0.1,0.1),
                  cdp(theory_2_2,0.9,1)]).

Figure 7: An SLP in Prolog
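Fig 7 leaves select_clause//2 abstract. Reading each cdp(Name, P, Cum) entry as a clause name with its probability and running cumulative total, and guessing that the hidden DCG arguments thread a record of the choices made (this is our reconstruction, not code from the paper), a minimal version is:

select_clause(Pred, ClauseNum) -->
    { labels(Pred, Dist),
      random(R),
      member(cdp(ClauseNum, P, Cum), Dist),
      R =< Cum, ! },
    [choice(Pred, ClauseNum, P)].   % record the choice on the DCG stream

Because the cdp entries are listed in cumulative order, the first entry whose cumulative value bounds R is selected with exactly its probability, and the recorded choice/3 terms let the caller recover the potential of the sampled derivation.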

We have yet to do real experiments to test whether our sampling algorithm converges on the true posterior, but we at least have a working implementation that is reasonably efficient, and which can be used to explore the consequences of altering various parameters. Running the algorithm with no data (so the likelihoods do not need calculating) using the prior over logic programs took a little over 9 seconds of CPU time to produce (and write out) 10,000 samples on a Pentium 233MHz running Yap Prolog. This involved 546 acceptances of a proposed $M^*$, an acceptance rate of only 5.46%, and 337 distinct logic programs. When run using a dataset of 5 positive and 5 negative examples and a simple 10% classification noise likelihood function, 10,000 samples took 11.5 seconds, involving 451 jumps and 465 distinct logic programs. These runs were done using a larger setting of the backtrack probability $p_b$; reducing $p_b$ to 0.3 produced only 39 jumps out of 10,000.

10 Conclusions and Future Directions

We have defined a number of algorithms for SLPs, together with relevant mathematical analysis. This goes some way to establishing that SLPs can be a useful framework for representing and reasoning with complex probability distributions. We view the application to Bayesian machine learning as being the most promising area for future research. The definition of a general-purpose and practical transition kernel is probably the paper's major contribution. However, it remains to be proven by rigorous experimentation that our posterior sampler produces better results than more conventional search-based approaches. We have also yet to give a proper account of termination when sampling from SLPs.

In this paper, we have only considered priors over structures, not parameters; but it is easy to embed built-in predicates in the Prolog code to generate e.g. samples from a Dirichlet. More interestingly, there is the possibility of combining the likelihood with the prior to generate a posterior in the same form as the prior. It is easy to construct an impractical SLP for the posterior of the form:

posterior(Model) :- prior(Model), likelihood(Model).

The interesting question is whether this impractical definition can be transformed into a usable representation. One problem here is that the size of an efficient representation of the posterior is likely to explode, given that the posterior is generally more complex than the manually-derived prior. We conclude by pointing to (Cussens, 2000), where a much more detailed account of SLPs is given, and where the EM algorithm is applied to estimate SLP parameters from data.

Acknowledgments

Special thanks to Blaise Egan for preventing me from reinventing the wheel and Gillian Higgins for putting up with me. Thanks also to Stephen Muggleton and Suresh Manandhar for useful discussions.

References

Chipman, H. A., George, E. I., and McCulloch, R. E. (1998). Bayesian CART model search. Journal of the American Statistical Association, 93(443):935–960. With discussion.

Cussens, J. (1999). Loglinear models for first-order probabilistic reasoning. In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-99), pages 126–133, San Francisco, CA. Morgan Kaufmann Publishers.

Cussens, J. (2000). Parameter estimation in stochastic logic programs. Submitted to Machine Learning.

Denison, D. G. T., Mallick, B. K., and Smith, A. F. M. (1998). A Bayesian CART algorithm. Biometrika, 85(2):363–377.

Gelman, A., Carlin, J., Stern, H., and Rubin, D. (1995). Bayesian Data Analysis. Chapman & Hall, London.

Muggleton, S. (1996). Stochastic logic programs. In de Raedt, L., editor, Advances in Inductive Logic Programming, pages 254–264. IOS Press.

Muggleton, S. (2000). Semantics and derivation for stochastic logic programs. In UAI-2000 Workshop on Fusion of Domain Knowledge with Data for Decision Support.

Riezler, S. (1998). Probabilistic Constraint Logic Programming. PhD thesis, Universität Tübingen. AIMS Report 5(1), 1999, IMS, Universität Stuttgart.
