What is the difference between POS tagging and chunking/shallow parsing?

MT-Oriented English PoS Tagging and Its Application to Noun Phrase Chunking
Memory-based shallow parsing
We present a memory-based learning (MBL) approach to shallow parsing in which POS tagging, chunking, and identification of syntactic relations are formulated as memory-based modules. The experiments reported in this paper show competitive results, the F ...
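Memory-based learning keeps the training instances in memory and labels a new item by looking up its most similar stored neighbours. A minimal sketch of that idea applied to POS tagging, assuming a three-word context window and a plain feature-overlap similarity; the toy sentences, the Penn-style tags and all function names are illustrative and are not taken from the paper:

```python
from collections import Counter

def make_instances(tagged_sentence):
    """Turn [(word, tag), ...] into ((prev, word, next), tag) instances."""
    words = [w for w, _ in tagged_sentence]
    padded = ["<s>"] + words + ["</s>"]
    return [((padded[i], padded[i + 1], padded[i + 2]), tag)
            for i, (_, tag) in enumerate(tagged_sentence)]

def overlap(a, b):
    """Similarity = number of feature positions with identical values."""
    return sum(x == y for x, y in zip(a, b))

def classify(memory, features, k=1):
    """Majority tag among the k stored instances most similar to `features`."""
    ranked = sorted(memory, key=lambda inst: overlap(inst[0], features),
                    reverse=True)
    votes = Counter(tag for _, tag in ranked[:k])
    return votes.most_common(1)[0][0]

def tag_sentence(memory, words, k=1):
    """Tag each word by nearest-neighbour lookup on its context window."""
    padded = ["<s>"] + list(words) + ["</s>"]
    return [classify(memory, (padded[i], padded[i + 1], padded[i + 2]), k)
            for i in range(len(words))]

# Hypothetical toy training data; "training" is just storing the instances.
train = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]
memory = [inst for sent in train for inst in make_instances(sent)]

print(tag_sentence(memory, ["the", "bird", "sleeps"]))  # ['DT', 'NN', 'VBZ']
```

The unseen word "bird" still gets NN because its neighbours ("the" before, "sleeps" after) match a stored instance. A chunking module could be built the same way by storing chunk labels instead of POS tags.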
A robust shallow parser for Swedish

... without verbs (NPs, PPs) are chosen and so forth. If this head constituent does not cover the entire sentence, remaining constituents are added on either side of the head constituent based on another preference. The fitting procedure works outward from the head constituent.

The Tetris Algorithm

One main difference between GTA and Heidorn and Jensen's approach is that GTA never tries to build a full tree from a core grammar. GTA always applies a parse fitting procedure; by doing so, many ambiguity and efficiency problems are avoided. GTA's approach to parse fitting is not linguistically motivated; instead it relies on longest matching. The constituents are sorted according to length. Then the longest constituent is selected from the right to the left. The fitting procedure then tries to fit in the second longest constituent to the left, to the right and inside the selected constituent, and so forth. Overlapping constituents cannot be selected. Thus, the whole sentence will be assigned a constituent structure, and in addition, the internal structure of the constituents is filled in when a shorter constituent can be fitted in a longer constituent.

5 Evaluation

The parser has been evaluated on 15,000 words from the SUC corpus. Five text genres were used. In the absence of a Swedish treebank annotated with constituency trees, the texts were manually annotated with constituency structure, without top-nodes, based on the output from the parser. However, the manual annotation is more homogeneous across the phrase types than the output of GTA. This means that there are systematic errors in the output from the parser. The evaluation results are therefore calculated on the untuned output from the parser. The accuracy on the phrase structure task is 88.7 percent (see table 1) and the F-score for the clause boundary detection is 88.2 percent (see table 2). In the evaluation we used part-of-speech tagged data from four different sources/taggers: a baseline tagger called Unigram, which chooses the most frequent tag for a given word and the most frequent tag (for open word classes) for unknown words; the original corpus tags from SUC (Ejerhed et al., 1992); a faster version of the Brill tagger, called fnTBL (Ngai and Florian, 2001); and the hidden Markov model (HMM) tagger TnT (Brants, 2000).

The parser seems to work best on PPs, APs, VCs and NPs (see table 3). Adverb phrases and infinitive verb phrases are identified with a lower accuracy. It is often hard for the rules to determine the end of these constructions. Some noun phrases are identified with post attributes as relative clauses; the results are not fully satisfying, and therefore one refinement of GTA should be to exclude all post-modifying phrases from the analysis. For a more detailed description of the evaluation see (Bigert et al., 2003).

In addition to the standard evaluation described above, a glass-box evaluation of GTA's robustness was made (Bigert et al., 2003). In this evaluation spelling errors were automatically introduced in the texts, which were then fed to the parsing system. The evaluation showed that GTA is robust and degrades gracefully, i.e. GTA degrades linearly with the part-of-speech taggers' degradation. In other words, if the tagger is robust (i.e. predictable), GTA will also be robust.

6 Concluding Remarks and Future Work

Without a Swedish treebank the results of the evaluation are preliminary; they can only serve as an indicator of the parser's performance. The choices made when annotating the test corpus are important when evaluating a parser. When there is a Swedish treebank available, more reliable and easily comparable evaluations of GTA [1] can be made.

The next step in the development of GTA is to extend the analysis to clause types and syntactic functions. With syntactic functions included in the analysis, GTA can be compared not only with parsers assigning constituency structure, but partly with dependency parsers as well.

[1] GTA can be tested here: http://skrutten.nada.kth.se/grim/form.html

Tagger      UNIGRAM   BRILL   TNT
Accuracy    81.0      86.2    88.7

Table 1: Accuracy in percent from the parsing task. Parsing based on the manual tagging in SUC had 88.4% accuracy. A baseline parser using the original SUC tagging had 59.0% accuracy. For a given part-of-speech tag the baseline parser assigns the most frequent parse for that tag.

Tagger      UNIGRAM   BRILL   TNT
F-score     84.2      87.3    88.3

Table 2: F-score from the clause boundary identification task. Identification based on the original SUC tagging had an F-score of 88.2%. A baseline identifier had an F-score of 69.0%. The baseline identifier assigns CLB to the first word of each sentence and CLI to the other words.

Type        ADVP   AP     INFP   NP     O      PP     VC     Total
Accuracy    81.9   91.3   81.9   91.4   94.4   95.3   92.9   88.7
Count       562 (remaining counts lost in extraction)

Table 3: F-scores for the individual phrase categories from the parse task. TNT was used to tag the text.

References

S. Abney. 1991. Parsing by chunks. In R. C. Berwick, S. P. Abney, and C. Tenny, editors, Principle-Based Parsing: Computation and Psycholinguistics, pages 257-278. Kluwer Academic Publishers, Boston.
R. Basili and F. M. Zanzotto. 2002. Parsing engineering and empirical robustness. Natural Language Engineering, 8(2-3):97-120.
J. Bigert and O. Knutsson. 2002. Robust error detection: A hybrid approach combining unsupervised error detection and linguistic knowledge. In Proc. 2nd Workshop Robust Methods in Analysis of Natural language Data (ROMAND'02), Frascati, Italy, pages 10-19.
J. Bigert, O. Knutsson, and J. Sjöbergh. 2003. Automatic evaluation of robustness and degradation in tagging and parsing. In Proc. RANLP 2003, pages 51-57, Borovets, Bulgaria.
J. Birn. 1998. Swedish constraint grammar. Technical report, Lingsoft Inc, Helsinki, Finland.
T. Brants. 2000. TnT - a statistical part-of-speech tagger. In Proc. 6th Applied NLP Conference, ANLP-2000, Seattle, USA.
B. Brodda. 1983. An experiment with heuristic parsing of Swedish. In Proc. of First Conference of the European Chapter of the Association for Computational Linguistics, pages 66-73, Pisa, Italy.
E. Ejerhed, G. Källgren, O. Wennstedt, and M. Åström. 1992. The Linguistic Annotation System of the Stockholm-Umeå Project. Department of Linguistics, University of Umeå, Sweden.
E. Ejerhed. 1999. Finite state segmentation of discourse into clauses. In A. Kornai, editor, Extended Finite State Models of Language, chapter 13. Cambridge University Press.
B. Gambäck. 1997. Processing Swedish Sentences: A Unification-Based Grammar and some Applications. Ph.D. thesis, The Royal Institute of Technology and Stockholm University.
J. Hammerton, M. Osborne, S. Armstrong, and W. Daelemans. 2002. Introduction to special issue on machine learning approaches to shallow parsing. J. Machine Learning Research, Special Issue on Shallow Parsing(2):551-558.
T. Järvinen and P. Tapanainen. 1997. A dependency parser for English. Technical report, Department of Linguistics, University of Helsinki.
K. Jensen, G. Heidorn, L. Miller, and L. Ravin. 1983. Parse fitting and prose fixing: getting a hold on ill-formedness. American Journal of Computational Linguistics, 9(3-4):147-160.
K. Jensen. 1993. PEG: The PLNLP English grammar. In K. Jensen, G. E. Heidorn, and S. D. Richardson, editors, Natural Language Processing: The PLNLP Approach, pages 29-43. Kluwer, Boston, USA.
G. Källgren. 1991. Parsing without lexicon: the morp system. In Proc. Fifth Conference of the European Chapter of the Association for Computational Linguistics, pages 143-148, Berlin, Germany.
F. Karlsson, A. Voutilainen, J. Heikkilä, and A. Anttila. 1995. Constraint Grammar. A Language Independent System for Parsing Unrestricted text. Mouton de Gruyter, Berlin, Germany.
D. Kokkinakis and S. Johansson-Kokkinakis. 1999. A cascaded finite-state parser for syntactic analysis of Swedish. In Proc. 9th European Chapter of the Association of Computational Linguistics (EACL), pages 245-248, Bergen, Norway. Association for Computational Linguistics.
X. Li and D. Roth. 2001. Exploring evidence for shallow parsing. In Walter Daelemans and Rémi Zajac, editors, Proc. of CoNLL-2001, pages 38-44, Toulouse, France.
B. Megyesi. 2002. Shallow parsing with PoS taggers and linguistic features. J. Machine Learning Research, Special Issue on Shallow Parsing(2):639-668.
W. Menzel. 1995. Robust processing of natural language. In Proc. 19th Annual German Conference on Artificial Intelligence, pages 19-34, Berlin. Springer.
G. Ngai and R. Florian. 2001. Transformation-based learning in the fast lane. In Proceedings of NAACL-2001, pages 40-47, Carnegie Mellon University, Pittsburgh, USA.
L. Ramshaw and M. Marcus. 1995. Text chunking using transformation-based learning. In David Yarovsky and Kenneth Church, editors, Proc. Third Workshop on Very Large Corpora, pages 82-94, Somerset, New Jersey. Association for Computational Linguistics.
A. Sågvall Hein, A. Almqvist, E. Forsbom, J. Tiedemann, P. Weijnitz, L. Olsson, and S. Thaning. 2002. Scaling up an MT prototype for industrial use. Databases and data flow. In Proc. Third International Conference on Language Resources and Evaluation (LREC 2002), Las Palmas, Spain.
A. Sågvall Hein. 1982. An experimental parser. In Proc. of the Ninth International Conference on Computational Linguistics (Coling 82), pages 121-126, Prague.
E. F. Tjong Kim Sang. 2000. Noun phrase recognition by system combination. In Proc. ANLP-NAACL 2000, Seattle, Washington, USA.
A. Voutilainen. 1994. Designing a parsing grammar. Technical report, Department of Linguistics, University of Helsinki, Finland.
A. Voutilainen. 2001. Parsing Swedish. In Proc. 13th Nordic Conference on Computational Linguistics (Nodalida-01), Uppsala, Sweden.

What is the difference between POS tagging and chunking/shallow parsing? - Zhihu

if output=="猴子/NR 喜欢/VV 吃/VV 香蕉/NN 。/PU":
    It is POS Tagging.
elif output=="(NP 猴子)(VP 喜欢吃香蕉)":
    It is Chunking.

(The example sentence 猴子喜欢吃香蕉 means "Monkeys like to eat bananas". NR, VV, NN and PU are the Chinese Treebank tags for proper noun, verb, common noun and punctuation.)

A POS tagger can be thought of as a parser which only returns the bottom-most tier of the parse tree to you. A chunker might be thought of as a parser that returns some other tier of the parse tree instead. Sometimes you just need to know that a bunch of words together form a noun phrase but don't care about the sub-structure of the tree within those words (i.e. which words are adjectives, determiners, nouns, etc., and how they combine). In such cases you can use a chunker to get exactly the information you need instead of wasting time generating the full parse tree for the sentence.
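To make the contrast concrete, here is a toy sketch that takes the POS-tagged sentence from the answer and groups it into chunks. The grouping rule (a noun opens an NP, a verb opens a VP, and a VP absorbs the nouns that follow it) is an assumption chosen only to reproduce the example output; the function names are hypothetical, and a real chunker would be trained on annotated data or use a proper chunk grammar.

```python
NOUN_TAGS = {"NN", "NR"}  # common noun, proper noun (Chinese Treebank tags)
VERB_TAGS = {"VV"}        # verb

def parse_tagged(text):
    """Split 'word/TAG word/TAG ...' into a list of (word, tag) pairs."""
    return [tuple(token.rsplit("/", 1)) for token in text.split()]

def toy_chunk(tagged):
    """Group (word, tag) pairs into flat (label, words) chunks.

    Toy heuristic, not a real grammar: a noun opens an NP, a verb opens
    a VP, nouns extend whichever chunk is open, and any other tag
    (punctuation here) closes the open chunk.
    """
    chunks = []
    current = None  # the chunk still open for extension, if any
    for word, tag in tagged:
        if tag in VERB_TAGS:
            if current is None or current[0] != "VP":
                current = ("VP", [])
                chunks.append(current)
            current[1].append(word)
        elif tag in NOUN_TAGS:
            if current is None:
                current = ("NP", [])
                chunks.append(current)
            current[1].append(word)
        else:
            current = None
    return chunks

def format_chunks(chunks):
    """Render chunks in the bracketed style used in the answer."""
    return "".join("({} {})".format(label, "".join(words))
                   for label, words in chunks)

# POS tagging output: one tag per word, the bottom tier of the parse tree.
tagged = parse_tagged("猴子/NR 喜欢/VV 吃/VV 香蕉/NN 。/PU")

# Chunking output: words grouped into flat phrases, no internal structure.
print(format_chunks(toy_chunk(tagged)))  # (NP 猴子)(VP 喜欢吃香蕉)
```

The tagger's output keeps one label per word, while the chunker's output keeps only phrase boundaries and labels, which is the "other tier" of the tree described above.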