Department of Computer Science, University of York
Heslington, York, YO10 5DD, UK
jc@cs.york.ac.uk
Abstract
Algorithms for exact and approximate inference in stochastic logic programs (SLPs) are presented, based, respectively, on variable elimination and importance sampling. We then show how SLPs can be used to represent prior distributions for machine learning, using (i) logic programs and (ii) Bayes net structures as examples. Drawing on existing work in statistics, we apply the Metropolis-Hastings algorithm to construct a Markov chain which samples from the posterior distribution. A Prolog implementation for this is described. We also discuss the possibility of constructing explicit representations of the posterior.
1 Introduction

A stochastic logic program (SLP) is a probabilistic extension of a normal logic program that has been proposed as a flexible way of representing complex probabilistic knowledge; generalising, for example, Hidden Markov Models, Stochastic Context-Free Grammars and Markov nets (Muggleton, 1996; Cussens, 1999). However, we need to ask (i) whether this increase in flexibility is needed for any real problems and (ii) whether reasonable algorithms exist for inference and learning in SLPs.

In this paper we give a number of approaches to approximate and exact inference in SLPs, focusing mostly on sampling. More importantly, we apply SLPs to an important problem — Bayesian machine learning — which would be difficult to handle with simpler representations.

The paper is organised as follows. Section 2 contains essential definitions from logic programming. Section 3 shows how we can use SLPs to define three different sorts of distributions, but focuses on the appealingly simple loglinear model. Sections 4 and 5 give two quite different ways of improving sampling for the loglinear model, based on Prolog and importance sampling, respectively. Section 6 briefly shows how variable elimination can be used for exact inference in SLPs. After showing how to extend SLPs with non-equational constraints in Section 7, we can finally bring much of the preceding work together in Sections 8 and 9 to show how SLPs can be used to represent distributions over complex model spaces, such as is required for 'really Bayesian' machine learning. Section 10 contains conclusions and pointers to possible future work.

2 Logic Programming Essentials

An overview of logic programming can be found in (Cussens, 1999). Here, lack of space means that only the most important definitions are flagged. An SLD-derivation of a goal $G$ using a logic program $P$ is a (finite or infinite) sequence, each element of which is either a 4-tuple $(G_i, A_i, C_i, \theta_i)$, fail, or the empty goal $\square$, where

- $A_i$ is the selected atom of goal $G_i$;

- $C_i$ is the selected input clause in $P$, with its variables renamed so that $C_i$ and $G_i$ have no variables in common;

- $\theta_i$ is the most general unifier of $A_i$ and $head(C_i)$ (the head of $C_i$), or fail if they cannot be unified.

If $\theta_i$ is fail then $G_{i+1} =$ fail. Otherwise, $G_{i+1}$ is the result of replacing $A_i$ by $body(C_i)$ (the body of $C_i$) in $G_i$ and then applying $\theta_i$ to the result. If no atoms remain then $G_{i+1} = \square$.
An SLD-refutation is a finite SLD-derivation ending in the empty goal $\square$. The SLD-tree for a goal $G$ is a tree of goals, with $G$ as root node, and such that the children of any goal are the goals produced by one resolution step using it ($\square$ and fail have no children). Fig 6 shows an SLD-tree. A computed answer for a goal $G$ is a substitution for the variables in $G$ produced by an SLD-refutation of $G$.
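As a small illustration, consider the toy program below (purely illustrative, not from any figure): the SLD-tree for the goal :- r(X) has two refutations, one using the first clause and one resolving through t/1, giving the computed answers {X/a} and {X/b}.

r(a).
r(X) :- t(X).
t(b).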
3 Defining Distributions with SLPs
A stochastic logic program (SLP) is a logic program where some of the predicates (the distribution-defining or probabilistic predicates) have non-negative numbers attached to the clauses which make up their definitions. We denote the label for a clause $C_i$ by $l_i$. We also write $l_i : C_i$, and we denote the parameter vector containing all the $l_i$ by $\lambda$. In this paper we will consider only normalised SLPs, where the labels for the clauses making up the definition of a predicate sum to one.
The basic idea is that clause labels probabilistically influence which input clause is chosen as a derivation proceeds, thus defining a distribution over derivations from which distributions over (i) refutations and (ii) variable bindings may be derived. We begin by restricting attention to pure SLPs, where all predicates have labelled definitions, postponing the impure case until Section 7.

3.1 Loglinear Model
Given an SLP $S$ with parameters $\lambda$ and a goal $G$, we can sample SLD-derivations for $G$ using loglinear sampling as follows:

Loglinear sampling: Any computation rule may be used to select the atom $A_i$ from the current goal $G_i$. The next input clause $C_i$ is chosen with probability $l_i$ from those clauses in $S$ with the same predicate symbol in the head as $A_i$. We stop when we produce either fail or $\square$.
Let $R(G)$ denote the set of refutations of the goal $G$ (i.e. derivations that end in $\square$). Let $\nu_i(r)$ be the number of times the clause labelled $l_i$ is used in refutation $r$. The loglinear distribution over refutations of the goal $G$ is

$$P_{\mathrm{loglin}}(r) = \frac{f_{\mathrm{loglin}}(r)}{\sum_{r' \in R(G)} f_{\mathrm{loglin}}(r')}, \qquad \text{where } f_{\mathrm{loglin}}(r) = \prod_i l_i^{\nu_i(r)}.$$
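To make this concrete, here is a minimal Prolog sketch of loglinear sampling over goals represented as lists of atoms. The clause store slp_clause(Label, Head, BodyList) and the helpers choose/2 and pick/3 are hypothetical, not part of any implementation described in this paper; the sketch assumes a normalised SLP.

:- use_module(library(random)).      % for random/1 (SWI/Yap)

% Loglinear sampling: succeed with a refutation, or fail if the
% derivation produces fail (no renormalisation, no backtracking).
sample([]).                          % empty goal: a refutation is found
sample([A|G]) :-
    functor(A, F, N),
    functor(H, F, N),                % template with A's predicate symbol
    findall(L-(H :- B), slp_clause(L, H, B), Cs),
    choose(Cs, (H1 :- B1)),          % pick clause i with probability l_i
    A = H1,                          % unify selected atom with the head...
    append(B1, G, G1),               % ...and replace it by the body
    sample(G1).

% choose(+LabelledItems, -Item): pick Item with probability equal to
% its label (labels are assumed to sum to one).
choose(Items, Item) :-
    random(R),                       % uniform random number in [0,1)
    pick(Items, R, Item).

pick([L-C|T], R, Item) :-
    (   R < L -> Item = C
    ;   R1 is R - L, pick(T, R1, Item)
    ).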
3.2 Unification-Constrained Model

In unification-constrained sampling, the next input clause is chosen from $\mathrm{unif}_i$, the set of clauses whose heads unify with the selected atom $A_i$, with probability proportional to the clause labels. This is the original sampling mechanism defined by (Muggleton, 1996). It is a theorem of logic programming that normal SLD-refutation is essentially unaltered by changing the atom selection rule (although drastic changes in efficiency can occur). If we delay selecting an atom $A_i$, all that changes is that $A_i$ may be more instantiated when we eventually do select it. Unfortunately, this means that delaying the selection of $A_i$ will alter $\mathrm{unif}_i$, and so we are obliged to fix the selection rule in unification-constrained sampling so that a single distribution is defined.

The unification-constrained distribution over refutations $r$ for a goal $G$ is

$$P_{\mathrm{unif}}(r) = \frac{f_{\mathrm{unif}}(r)}{\sum_{r' \in R(G)} f_{\mathrm{unif}}(r')}, \qquad \text{where } f_{\mathrm{unif}}(r) = \prod_i \frac{l_i}{\sum_{C \in \mathrm{unif}_i} l(C)}.$$

Computing the unnormalised probability (potential) of a refutation is only slightly less easy than with the loglinear model, since all the values $l(C)$, for $C \in \mathrm{unif}_i$, can be quickly found as the refutation proceeds.
3.3 Backtrackable Model
Whereas unification-constrained sampling stops as soon as it reaches fail, backtrackable sampling backtracks upon failure, and so will generally be a more efficient way of sampling refutations. Backtrackable sampling is essentially Prolog with a probabilistic clause selector.

Backtrackable sampling: The selected atom is always the leftmost atom in the current goal. The next input clause $C_i$ is chosen with probability $l_i$. If we produce fail then we backtrack to the most recent choice-point, delete the choice of input clause that led to failure, and choose from amongst the surviving choices with probability proportional to clause labels. If no choices remain then we backtrack further until a choice can be made, or return fail. We stop if we produce $\square$ or fail.
Let $\mathrm{succ}_i$ be the set of input clauses which lead to at least one refutation. The backtrackable distribution over refutations $r$ for a goal $G$ is

$$P_{\mathrm{back}}(r) = \prod_i \frac{l_i}{\sum_{C \in \mathrm{succ}_i} l(C)}.$$

Computing the unnormalised probability (potential) of a refutation is, in general, hard, since we may have to explore a very large SLD-tree rooted at $G$ to identify the sets $\mathrm{succ}_i$.
Comparing the loglinear, unification-constrained and backtrackable models, we see there is a trade-off between ease of sampling a refutation and ease of computing a potential for a given refutation. If only sampling is required and we are happy that the order of literals in clauses matters, then the backtrackable model makes sense. However, the loglinear model has attractively simple mathematical properties which we will exploit in the MCMC application. Fortunately, loglinear sampling can be sped up, as the next section shows.
4 Improving Loglinear Sampling
(Muggleton, 1996) explicitly introduced SLPs as generalisations of Hidden Markov Models (HMMs) and Stochastic Context-Free Grammars (SCFGs). Comparing SLPs to HMMs we see that in SLPs: (i) the states of an HMM are replaced by goals; (ii) the outputs of an HMM are replaced by substitutions; (iii) concatenation of outputs is replaced by composition of substitutions; (iv) outputs (substitutions) are generated deterministically; and (v) state transition probabilities are given by clause labels. It is also more natural in SLPs to associate outputs (substitutions) with transitions between states (goals) rather than with states themselves. The connection between SLPs and SCFGs is even closer. Consider the SCFG in Fig 1, which has been implemented as an SLP. We can generate sentences using loglinear sampling with the goal :- s(A,[]). Since it is an SCFG, the query will always succeed, even though we do not allow backtracking in the loglinear model.
1:   s(A,B) :- n(A,C), v(C,D), n(D,B).
0.4: n([joe|T],T).
0.6: n([kim|T],T).
0.3: v([sees|T],T).
0.7: v([likes|T],T).
Figure 1: An SCFG
Suppose now that we are interested only in reflexive sentences. We then apply a constraint to the SCFG, replacing

1: s(A,B) :- n(A,C), v(C,D), n(D,B).

with

1: s(A,B) :- n(A,C), v(C,D), n(D,B),
             A=[N|T1], D=[N|T2].

or more concisely:

1: s([N|T1],B) :- n([N|T1],C),
                  v(C,[N|T2]), n([N|T2],B).
Now we cannot guarantee that :- s(A,[]). will always succeed: the grammar is no longer context-free. This means that sampling sentences from the new conditional distribution (conditional on the constraint being satisfied) is less efficient. We have to throw away derivations that are inconsistent with the constraint, just as with forward sampling in Bayes nets in the presence of evidence.

In the loglinear model we may select any atom from the current goal, which means that the order of literals in the bodies of clauses does not affect the distribution. However, since we will use Prolog to implement SLPs, we can exploit Prolog's leftmost atom selection rule to force constraints to be effected as early as possible. We do this by simply moving the constraints leftwards so that Prolog encounters them earlier. This has the effect of producing fail as soon as our choice of input clauses has ensured that a derivation cannot succeed. Fig 2 has an ordering of body literals for s/2 that ensures that a derivation fails as soon as we pick a second noun which is not the same as the first — we don't waste time choosing the verb. The moral is: it is better to fail sooner than later.
1:   s([N|T1],B) :- n([N|T1],C),
                    n([N|T2],B), v(C,[N|T2]).
0.4: n([joe|T],T).
0.6: n([kim|T],T).
0.3: v([sees|T],T).
0.7: v([likes|T],T).
Figure 2: A simple grammar
5 Importance Sampling for SLPs
SLPs are only required for complex distributions, where it is optimistic to depend on exact inference. Approximate inference can be based on sampling, where e.g. to estimate the probability of some event, we sample from the SLP and obtain the event's relative frequency. Unfortunately, even with the Prolog-based speedup given above, pure loglinear sampling can still be slow. However, since $P_{\mathrm{unif}}$ is easier to sample from than the loglinear distribution, the obvious solution is to use importance sampling (Gelman et al., 1995). We produce samples from $P_{\mathrm{unif}}$ and then weight them with the importance weights $w(r)$. We have:

$$w(r) = \frac{f_{\mathrm{loglin}}(r)}{f_{\mathrm{unif}}(r)} = \prod_i \left( \sum_{C \in \mathrm{unif}_i} l(C) \right).$$

We can update $w(r)$ as we go, so it is easy to compute.
This gives us weighted samples for a particular goal, where the weights are known up to a normalising constant. For approximate inference, the unknown normalising constant will often cancel out. For example, it is frequently enough to estimate the ratio between probabilities.
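A sketch of how the weight can be accumulated during unification-constrained sampling follows, reusing the hypothetical slp_clause/3, choose/2 and pick/3 helpers from the sketch in Section 3. Each resolution step multiplies the running weight by the label mass of the unifiable clauses, so that a successful call returns $w(r)$ along with the bindings.

% usample(+Goal, +W0, -W): unification-constrained sampling of a
% refutation of Goal (a list of atoms), accumulating the weight
% w(r) = prod_i sum_{C in unif_i} l(C).
usample([], W, W).                   % refutation found: W is the weight
usample([A|G], W0, W) :-
    findall(L-(H :- B),
            ( slp_clause(L, H, B),
              \+ \+ (A = H) ),       % keep clauses whose head unifies with A
            Unif),
    Unif \= [],                      % no unifiable clause: derivation fails
    sum_labels(Unif, S),
    rescale(Unif, S, Scaled),        % renormalise labels over unif_i
    choose(Scaled, (A :- B)),        % choosing also unifies A with the head
    W1 is W0 * S,                    % multiply in this step's label mass
    append(B, G, G1),
    usample(G1, W1, W).

sum_labels([], 0.0).
sum_labels([L-_|T], S) :- sum_labels(T, S0), S is S0 + L.

rescale([], _, []).
rescale([L-C|T], S, [L1-C|T1]) :- L1 is L / S, rescale(T, S, T1).

With the constrained grammar of Fig 2 stored via slp_clause/3, a call such as usample([s(Sent,[])], 1.0, W) would return a sampled reflexive sentence together with its weight.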
6 Exact Inference in SLPs
Each refutation of a goal $G$ produces a computed answer — variable bindings for the variables in $G$. We can define a distribution over the computed answers for $G$ by marginalisation — we sum over all refutations that produce a given computed answer. It is convenient to represent computed answers by atoms: define the yield $y(r)$ of a refutation $r$ of a unit goal $\leftarrow A$ to be $A\theta$, where $\theta$ is the computed answer for $\leftarrow A$ using $r$. Let $\theta$ be a computed answer for the goal $\leftarrow A$; then the corresponding yield is $y = A\theta$ and:

$$P(y) = \sum_{r : y(r) = y} P(r).$$

If a goal can be decomposed into subgoals $G_1$ and $G_2$ which do not share variables, then

$$P(G_1, G_2) = P(G_1)\,P(G_2).$$

For goals, such as $\leftarrow p(X), q(X)$, which cannot be decomposed into subgoals without common variables, we are forced to find splitting substitutions. A substitution $\theta$ splits two goals $G_1$ and $G_2$ if $G_1\theta$ and $G_2\theta$ do not share variables. Let $\Theta$ be a set of splitting substitutions for $G_1$ and $G_2$ which includes all computed answers for the goal $\leftarrow G_1, G_2$ restricted to the common variables of $G_1$ and $G_2$. Then:

$$P(G_1, G_2) = \sum_{\theta \in \Theta} P(G_1\theta)\,P(G_2\theta).$$
We would like to find a small $\Theta$ fairly quickly. If each variable can only take a fairly small number of discrete values, as is often the case in Bayesian nets, we can just go through each of these values. We end this section by noting that these computations need to be vectorised to return a (finite) distribution over bindings for a variable, rather than a single probability.
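As a small worked instance (with hypothetical predicates p/1 and q/1), take the goal $\leftarrow p(X), q(X)$, where $X$ ranges over $\{a, b\}$. The set $\Theta = \{\{X/a\}, \{X/b\}\}$ splits the two subgoals, and

$$P(p(X), q(X)) = P(p(a))\,P(q(a)) + P(p(b))\,P(q(b)).$$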
7 Impure SLPs
We can extend the definition of SLPs by going beyond conjunctions of equational constraints such as those used in Section 4. Fig 3 shows an SLP for a fragment of French, where there is a constraint that adjectives and nouns agree on gender. The g/2 predicate that defines the constraint is unlabelled, since it plays no part in defining the distribution except to cut out derivations which are inconsistent with it. When an unlabelled predicate is encountered in a derivation we only consider the first variable binding it produces (if any). Backtracking may be used to produce this first binding, but we may not return to seek another binding if we hit failure later on.
Equational constraints can be placed as early as we like. For other constraints, placing them too early can prevent permissible derivations from being found. The 2nd (commented out) version of s/2 in Fig 3 has the g(A,G) too early. Since backtracking is banned, only the first, il, binding will be found, overly constraining the value of A, so e.g. elle sera vieille will never be produced. In the correct version we probabilistically (partially) instantiate the variable A by our choice of input clause, and only then effect the constraint on G. Since we are allowed to backtrack within the call to g/2, this call will always succeed. The key is that it must be the probabilistic predicates that choose the variable bindings that matter.
Effecting negated constraints too early can let through more derivations than is safe. The third version of s/2 starts with a negated goal that will succeed and produce no variable bindings — so no constraint is effected. Had this double negation been at the end of the clause, it would have effected the desired constraint.
% Constraint too late - inefficient
% 1: s(A,B) :- n(A,C), v(C,D), a(D,B),
%              g(A,G), g(D,G).
% Constraint too early - overconstrained
% 1: s(A,B) :- g(A,G), n(A,C),
%              a(D,B), g(D,G), v(C,D).
% Constraint too early - underconstrained
% 1: s(A,B) :- \+ \+ (g(A,G), g(D,G)),
%              n(A,C), v(C,D), a(D,B).
% Constraint at the right time.
1:   s(A,B) :- n(A,C), g(A,G),
               a(D,B), g(D,G), v(C,D).
0.4: n([il|T],T).
0.6: n([elle|T],T).
0.3: v([est|T],T).
0.7: v([sera|T],T).
0.2: a([vieux|T],T).
0.8: a([vieille|T],T).
g([il|_],m).
g([elle|_],f).
g([vieux|_],m).
g([vieille|_],f).
Figure 3: Gender agreement constraint
8 Machine Learning for Dogmatic Bayesians
Finally, never forget that the goal of Bayesian computation is not the posterior mode, not the posterior mean, but a representation of the entire distribution, or summaries of that distribution such as 95% intervals for estimands of interest. (Gelman et al., 1995, p. 301) (italics in the original)
‘Bayesian’ approaches in machine learning do not live up to this exacting demand to represent the entire posterior, usually settling for just the posterior mode (MAP algorithms) or particular expectations (Bayes optimal classification). In this paper, we show how SLPs can be used to define priors representing a wide range of biases and constraints, and also show how to sample from posteriors. Although we fall short of constructing (usable) explicit representations of the posterior, such a possibility cannot be ruled out.
Our approach is based on the process prior approach for decision trees developed independently by (Chipman et al., 1998) and (Denison et al., 1998). Our presentation will follow that given by (Chipman et al., 1998). In short:
Instead of specifying a closed-form expression for the tree prior, $p(T)$, we specify $p(T)$ implicitly by a tree-generating stochastic process. Each realization of such a process can simply be considered a random draw from this prior. Furthermore, many specifications allow for straightforward evaluation of $p(T)$ for any $T$ and can be effectively coupled with efficient Metropolis-Hastings search algorithms (Denison et al., 1998)
We can use Metropolis-Hastings to sample from the posterior distribution over trees, by choosing an initial tree $T^0$ and producing new trees as follows, where $p(Y|T)$ is just the likelihood with tree $T$ (Denison et al., 1998):

1. Generate a candidate value $T^*$ with probability distribution $q(T^i, T^*)$.

2. Set $T^{i+1} = T^*$ with probability

$$\alpha(T^i, T^*) = \min\left(1,\; \frac{q(T^*, T^i)}{q(T^i, T^*)} \cdot \frac{p(Y|T^*)\,p(T^*)}{p(Y|T^i)\,p(T^i)}\right),$$

else set $T^{i+1} = T^i$.
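In Prolog, one step of this sampler might look as follows. This is only a sketch: propose/2, qratio/3 (computing $q(T^*, T^i)/q(T^i, T^*)$), likelihood/2 and prior_potential/2 are hypothetical predicates standing in for the machinery developed below.

:- use_module(library(random)).      % for random/1

% One Metropolis-Hastings step: propose a candidate, accept it with
% probability alpha, otherwise keep the current model.
mh_step(Mi, Mnext) :-
    propose(Mi, Mstar),
    qratio(Mi, Mstar, Q),            % q(Mstar,Mi) / q(Mi,Mstar)
    likelihood(Mstar, Lstar),
    likelihood(Mi, Li),
    prior_potential(Mstar, Pstar),   % unnormalised prior: the normalising
    prior_potential(Mi, Pi),         % factor cancels in the ratio
    Ratio is Q * (Lstar * Pstar) / (Li * Pi),
    (   Ratio >= 1.0 -> Alpha = 1.0 ; Alpha = Ratio ),
    random(R),
    (   R < Alpha -> Mnext = Mstar ; Mnext = Mi ).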
Because SLPs define distributions over first-order atomic formulae — the yields of refutations — they can easily represent distributions over model structures such as decision trees, Bayesian nets and logic programs. We will denote models using $M$, possibly superscripted. The simplest, but possibly very inefficient, approach to defining priors over the model space with SLPs is as follows:

model(M) :- gen(M), ok(M).

gen/1 generates possible models just like an SCFG generates sentences: there are no constraints, so we never hit fail. ok/1 is then a constraint which filters out models which we do not wish to include in the model space.
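A toy instance of this pattern (purely illustrative; these predicates do not appear in the paper's SLPs) generates models that are lists of literals and rejects any list containing duplicates:

0.5: gen([]).
0.5: gen([L|T]) :- lit(L), gen(T).
0.3: lit(p).
0.7: lit(q).
ok(M) :- sort(M, S),                 % sort/2 removes duplicates, so
         length(M, N), length(S, N). % equal lengths means no duplicates

Here gen/1 never fails, while ok/1 silently discards, say, [p,p], so sampling from model/1 by rejection wastes work in proportion to how picky ok/1 is.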
If ok/1 rejects many of the models generated by gen/1, then it will be inefficient to sample from the prior, and this inefficiency translates to inefficiency when running the Metropolis-Hastings algorithm. The solution, as explained in Section 4, is to move constraints as early as possible without altering the distribution. This has been done in the SLPs of which fragments can be found in Fig 4 and Fig 5. These define priors over logic programs and Bayesian networks, respectively. In each case, we simply define what a model is, using first-order logic, and then add labels to define a distribution over this model space. In the logic-program prior we have constraints that we do not want an empty theory and that any generated rules have not previously been generated (newrule/2). We also have a 'utility' constraint make_nice/2 which always succeeds and just rewrites the generated logic program into a more convenient form. In the Bayes-net prior we assume that the variable RVs is always instantiated to a list of names of random variables, so that it is used to define distributions of the form P(Net|RVs), where all the probabilistic information is associated with parents/3. A more efficient version would push the acyclic/1 constraint earlier.
model(LP) :- theory([],LP).
0.1: theory(Done,NicelyDone) :-
         \+ Done = [],
         make_nice(Done,NicelyDone).
0.9: theory(RulesSoFar,Done) :-
         rule(Rule),
         newrule(RulesSoFar,Rule),
         theory([Rule|RulesSoFar],Done).
Figure 4: Fragment of an SLP defining a prior over logic programs
model(RVs,Net) :-
    net(RVs,RVs,Net),
    acyclic(Net).
net([],_,[]).
net([H|T],RVs,Net) :-
    parents(H,RVs,Ps),
    append(Ps,TNet,Net),
    net(T,RVs,TNet).
Figure 5: Fragment of an SLP defining a prior over Bayesian nets (for a fixed set of random variables)
8.1 Imaginary Models
Having carefully filtered out unwanted models, we find that it is convenient to re-admit them to the model space when we implement our posterior sampling algorithm. However, all these imaginary models, which previously did not have a probability defined for them, will now get probability zero. Doing this ensures that generating a new proposed model $M^*$ using $q(M^i, M^*)$ is simple. If the proposed model is imaginary then we will never accept it, since $p(M^*) = 0$ gives $\alpha = 0$. An analogous approach exists in analysis, where it is often easier to do real analysis within the space of complex numbers.
Recall that the distribution over atoms (yields) is generated by marginalisation from a distribution of the same name over refutations of the goal. Extending our definition to include zero-probability imaginary models amounts to extending this underlying distribution on refutations to also include zero-probability SLD-derivations that end in fail. It turns out that this last distribution on derivations is the most convenient to work with. Note that each derivation corresponds to a leaf in an SLD-tree. We will associate a leaf corresponding to a failure derivation with fail, and a leaf corresponding to a refutation with the model yielded by that refutation. Non-leaf nodes of the SLD-tree will be associated with goals (see Fig 6).
8.2 The Transition Kernel
We can generate a new derivation (yielding a new proposed model $M^*$) from the derivation which yielded the current model $M^i$ as follows.

1. Backtrack one step to the most recent choice point in the SLD-tree.

2. We then probabilistically backtrack as follows: if at the top of the tree, stop. Otherwise backtrack one more step to the next choice point with probability $p_b$.

3. Use loglinear sampling to generate a derivation from the choice point chosen in the previous backtracking step. However, in the first step of sampling we may not choose the branch that leads back to $M^i$.
If the derivation so found ends in fail then $M^*$ is an imaginary model, so $\alpha = 0$ and we stay at $M^i$. The parameter $p_b$ controls the size of steps; if $p_b = 1$, we always restart from the top of the tree.
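The backtracking step is easy to implement. The sketch below (with a hypothetical fact backtrack_prob/1 holding $p_b$) returns the number of choice points J to backtrack through, given that Depth choice points lie above the current leaf; it realises exactly the distribution used in the analysis below.

:- use_module(library(random)).      % for random/1

% backtracks(+Depth, -J): always backtrack one step, then keep going
% with probability p_b, stopping for certain at the top of the tree.
backtracks(Depth, J) :-
    backtracks_from(1, Depth, J).

backtracks_from(J0, Depth, J) :-
    (   J0 =:= Depth -> J = J0       % top-level choice point: must stop
    ;   backtrack_prob(Pb),
        random(R),
        (   R < Pb                   % backtrack one more step...
        ->  J1 is J0 + 1,
            backtracks_from(J1, Depth, J)
        ;   J = J0                   % ...or stop at this choice point
        )
    ).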
Now we must calculate $\alpha(M^i, M^*)$ when $M^*$ turns out to be a real model. First let $G^*$ be the deepest common parent goal for $M^i$ and $M^*$. (Strictly, we should say "for the derivations that yield $M^i$ and $M^*$", but from now on we will abbreviate in this way.) There is only one way we can get from $M^i$ to $M^*$: backtracking to $G^*$ and then reaching $M^*$ from $G^*$. We cannot go via some parent of $G^*$ (such as $G$ in Fig 6), since we have prohibited going down the tree the same way we backtracked up it.
Suppose that $G^*$ is $d_i$ choice points above $M^i$ in the SLD-tree; then the probability of backtracking from $M^i$ through exactly $d_i$ choice points is $p_b^{d_i - 1}(1 - p_b)$ if $G^*$ is not the top-level choice point, and $p_b^{d_i - 1}$ if it is (since at the top we must stop). Let $B_i$ be the random variable that gives the number of backtracks from $M^i$. Similarly, we have $B^*$ and $d^*$ for $M^*$. Then we find that

$$\frac{P(B^* = d^*)}{P(B_i = d_i)} = p_b^{d^* - d_i},$$

whether $G^*$ is at the top-level choice point or not.
Figure 6: Jumping from $M^i$ to $M^*$ in the SLD-tree
The probability of reaching $M^*$ starting from $G^*$ is

$$\frac{f(M^*)/f(G^*)}{1 - l(C^i)},$$

where $C^i$ is the clause which is used at $G^*$ to get to $M^i$ (the branch we are forbidden to re-enter, hence the renormalisation), $f$ abbreviates $f_{\mathrm{loglin}}$, and $f(G^*)$ is the product of the labels of the clauses used to reach $G^*$, so that $f(M^*)/f(G^*)$ is the product of the labels used below $G^*$. So

$$q(M^i, M^*) = P(B_i = d_i)\,\frac{f(M^*)/f(G^*)}{1 - l(C^i)}.$$

Swapping the $i$ and $*$ symbols gives us $q(M^*, M^i)$. The SLD-tree in Fig 6 shows an example. Note the imaginary model under the top-level goal. Next, note that in the ratio $q(M^*, M^i)\,p(M^*)/(q(M^i, M^*)\,p(M^i))$ the labels of the clauses that get us below $G^*$ cancel out, since the labels of the clauses that get us as far as $G^*$ are common to $f(M^i)$ and $f(M^*)$. Finally, note that $p(M) \propto f(M)$, since we are actually dealing with derivations that yield models, and the normalising factor cancels out. Putting all this together:

$$\alpha(M^i, M^*) = \min\left(1,\; p_b^{d^* - d_i}\,\frac{1 - l(C^i)}{1 - l(C^*)}\,\frac{p(Y|M^*)}{p(Y|M^i)}\right),$$

where $C^*$ is the clause used at $G^*$ to get to $M^*$.
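Under the hypothetical bookkeeping that records, while generating the proposal, the depths $d_i$ and $d^*$ and the labels of the two clauses used at $G^*$, the data-independent part of this ratio is a one-liner:

% Acceptance factor p_b^(d*-d_i) * (1-l(C^i)) / (1-l(C*)); multiply by
% the likelihood ratio p(Y|Mstar)/p(Y|Mi) and cap at 1 to get alpha.
accept_factor(Di, Dstar, LCi, LCstar, A) :-
    backtrack_prob(Pb),
    A is Pb**(Dstar - Di) * (1 - LCi) / (1 - LCstar).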
9 Implementation

Fig 7 shows how the SLP of Fig 4 is written down for our Prolog implementation: probabilistic predicates become DCG rules indexed by clause identifiers, with the labels stored separately in labels/2 facts.

theory(main,In,Out) -->
    select_clause(theory_2,ClauseNum),
    theory(ClauseNum,In,Out).
theory(theory_2_1,Done,NicelyDone) -->
    { \+ Done = [] },
    { nice_all(Done,NicelyDone) }.
theory(theory_2_2,RulesSoFar,Done) -->
    rule(Rule),
    { newrule(RulesSoFar,Rule) }, !,
    theory(main,[Rule|RulesSoFar],Done).

labels(theory_2,[cdp(theory_2_1,0.1,0.1),
                 cdp(theory_2_2,0.9,1)]).
Figure 7: An SLP in Prolog
We have yet to do real experiments to test whether our sampling algorithm converges on the true posterior, but at least we have a working implementation that is reasonably efficient, and which can be used to explore the consequences of altering various parameters. Running the algorithm with no data (so the likelihoods do not need calculating) using the prior over logic programs took a little over 9 seconds of CPU time to produce (and write out) 10000 samples on a Pentium 233MHz running Yap Prolog. This involved 546 acceptances of a proposed $M^*$, an acceptance rate of only 5.46%, and involved 337 distinct logic programs. When run using a dataset of 5 positive and 5 negative examples and a simple 10% classification noise likelihood function, 10000 samples took 11.5 seconds, involving 451 jumps and 465 distinct logic programs. These runs were done with a fixed backtrack probability $p_b$; reducing $p_b$ to 0.3 produced only 39 jumps out of 10000.
10 Conclusions and Future Directions
We have defined a number of algorithms for SLPs, together with relevant mathematical analysis. This goes some way to establishing that SLPs can be a useful framework for representing and reasoning with complex probability distributions. We view the application to Bayesian machine learning as being the most promising area for future research. The definition of a general-purpose and practical transition kernel is probably the paper's major contribution. However, it remains to be proven by rigorous experimentation that our posterior sampler produces better results than more conventional search-based approaches. We have also yet to give a proper account of termination when sampling from SLPs.

In this paper, we have only considered priors over structures, not parameters; but it is easy to embed built-in predicates in the Prolog code to generate e.g. samples from a Dirichlet. More interestingly, there is the possibility of combining the likelihood with the prior to generate a posterior in the same form as the prior. It is easy to construct an impractical SLP for the posterior of the form:

posterior(Model) :- prior(Model), likelihood(Model).

The interesting question is whether this impractical definition can be transformed into a usable representation. One problem here is that the size of an efficient representation of the posterior is likely to explode, given that the posterior is generally more complex than the manually-derived prior. We conclude by pointing to (Cussens, 2000), where a much more detailed account of SLPs is given, and where the EM algorithm is applied to estimate SLP parameters from data.

Acknowledgments

Special thanks to Blaise Egan for preventing me from re-inventing the wheel and Gillian Higgins for putting up with me. Thanks also to Stephen Muggleton and Suresh Manandhar for useful discussions.
References

Chipman, H. A., George, E. I., and McCulloch, R. E. (1998). Bayesian CART model search. Journal of the American Statistical Association, 93(443):935–960. With discussion.

Cussens, J. (1999). Loglinear models for first-order probabilistic reasoning. In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-99), pages 126–133, San Francisco, CA. Morgan Kaufmann.

Cussens, J. (2000). Parameter estimation in stochastic logic programs. Submitted to Machine Learning.

Denison, D. G. T., Mallick, B. K., and Smith, A. F. M. (1998). A Bayesian CART algorithm. Biometrika, 85(2):363–377.

Gelman, A., Carlin, J., Stern, H., and Rubin, D. (1995). Bayesian Data Analysis. Chapman & Hall, London.

Muggleton, S. (1996). Stochastic logic programs. In de Raedt, L., editor, Advances in Inductive Logic Programming, pages 254–264. IOS Press.

Muggleton, S. (2000). Semantics and derivation for stochastic logic programs. In UAI-2000 Workshop on Fusion of Domain Knowledge with Data for Decision Support.

Riezler, S. (1998). Probabilistic Constraint Logic Programming. PhD thesis, Universität Tübingen. AIMS Report 5(1), 1999, IMS, Universität Stuttgart.