2025-1-9
Corresponding author(s): Samuel Schmidgall (sschmi46@)

Agent Laboratory: Using LLM Agents as Research Assistants

Samuel Schmidgall1,2, Yusheng Su1, Ze Wang1, Ximeng Sun1, Jialian Wu1, Xiaodong Yu1, Jiang Liu1, Zicheng Liu1 and Emad Barsoum1
1AMD, 2Johns Hopkins University

arXiv:2501.04227v1 [cs.HC] 8 Jan 2025
Historically, scientific discovery has been a lengthy and costly process, demanding substantial time and resources from initial conception to final results. To accelerate scientific discovery, reduce research costs, and improve research quality, we introduce Agent Laboratory, an autonomous LLM-based framework capable of completing the entire research process. This framework accepts a human-provided research idea and progresses through three stages (literature review, experimentation, and report writing) to produce comprehensive research outputs, including a code repository and a research report, while enabling users to provide feedback and guidance at each stage. We deploy Agent Laboratory with various state-of-the-art LLMs and invite multiple researchers to assess its quality by participating in a survey, providing human feedback to guide the research process, and then evaluating the final paper. We found that: (1) Agent Laboratory driven by o1-preview generates the best research outcomes; (2) The generated machine learning code is able to achieve state-of-the-art performance compared to existing methods; (3) Human involvement, providing feedback at each stage, significantly improves the overall quality of research; (4) Agent Laboratory significantly reduces research expenses, achieving an 84% decrease compared to previous autonomous research methods. We hope Agent Laboratory enables researchers to allocate more effort toward creative ideation rather than low-level coding and writing, ultimately accelerating scientific discovery.
https://AgentLaboratory.github.io
Figure 1 | Agent Laboratory takes as input a human research idea and a set of notes, provides this to a pipeline of specialized LLM-driven agents, and produces a research report and code repository.
1. Introduction
Scientists frequently face constraints that limit the number of research ideas they can explore at any given time, resulting in ideas being prioritized based on predicted impact. While this process helps determine which concepts are worth investing time in and how best to allocate limited resources effectively, many high-quality ideas remain unexplored. If the process of exploring ideas had fewer limitations, researchers would be able to investigate multiple concepts simultaneously, increasing the likelihood of scientific discovery.
In an effort to achieve this, recent work has explored the capability of LLMs to perform research ideation and automated paper generation, where LLM agents perform the role of human scientists (Baek et al. (2024); Ghafarollahi & Buehler (2024b); Lu et al. (2024a); Swanson et al. (2024)). The work of Baek et al. (2024) introduces ResearchAgent, which automatically generates research ideas, methods, and experiment designs, iteratively refining them through feedback from multiple reviewing agents that mirror peer discussions and leverage human-aligned evaluation criteria to improve the outputs. Lu et al. (2024a) explores fully automated paper generation, where The AI Scientist framework generates novel research ideas, writes code, conducts experiments, and creates a full scientific paper with an automated peer-review system to evaluate the work. Even though these works demonstrate that current LLMs can generate ideas judged to be more novel than those produced by human experts, Si et al. (2024) indicates that LLMs still exhibit weaknesses in feasibility and implementation details, suggesting a complementary rather than replacement role for LLMs in research. Therefore, we aim to design an autonomous agent pipeline that can assist humans toward implementing their own research ideas.
In this work, we introduce Agent Laboratory, an autonomous pipeline for accelerating the individual's ability to perform machine learning research. Unlike previous approaches, where agents participate in their own research ideation independent of human input (Baek et al. (2024); Lu et al. (2024b)), Agent Laboratory is designed to assist human scientists in executing their own research ideas using language agents. Agent Laboratory takes as input a human research idea and outputs a research report and code repository produced by autonomous language agents, allowing various levels of human involvement, where feedback can be provided at a frequency based on user preference. A detailed list of our contributions is provided below:
1. We introduce Agent Laboratory, an open-source LLM agent framework for accelerating the individual's ability to perform research in machine learning. In order to accommodate all users, Agent Laboratory is compute flexible, where various levels of compute can be allocated based on the individual's access to compute resources (e.g., CPU, GPU, memory) and model inference budget.
2. Human evaluators rated papers generated using Agent Laboratory across experimental quality, report quality, and usefulness, showing that while the o1-preview backend was perceived as the most useful, o1-mini achieved the highest experimental quality scores, and gpt-4o was behind in all metrics.
3. NeurIPS-style evaluations showed that o1-preview performed best among backends, particularly in clarity and soundness, according to human reviewers. However, a clear gap emerged between human and automated evaluations, with automated scores significantly overestimating quality (6.1/10 vs. 3.8/10 overall). Similar discrepancies were seen across clarity and contribution metrics, suggesting the need for human feedback to complement automated evaluations for more accurate assessments of research quality.
4. Co-pilot mode in Agent Laboratory was evaluated on custom and preselected topics, showing higher overall scores compared to autonomous mode. Co-pilot papers also saw trade-offs
in experimental quality and usefulness, reflecting challenges in aligning agent outputs with researcher intent.
5. The co-pilot feature in Agent Laboratory is overall found to have high utility and usability when rated by human users, with most participants deciding to continue usage after their experience.
6. Detailed cost and inference time statistics, as well as the breakdown of cost per paper phase, are presented for different model back-ends, demonstrating that Agent Laboratory offers automatic research at a greatly reduced price compared with other works (only $2.33 USD per paper with a gpt-4o backend).
7. State-of-the-art performance on a subset of MLE-Bench challenges using the proposed mle-solver, achieving higher consistency and scoring compared to other solvers, and earning more medals, including gold and silver, than MLAB, OpenHands, and AIDE.
We hope that this work takes a step toward accelerating scientific discovery in machine learning, allowing researchers to allocate more effort toward creative ideation and experiment design rather than low-level coding and writing.
2. Background & Related Work
Large language models. The research agents in this paper are built on autoregressive large language models (LLMs), which are trained on extensive text corpora to predict conditional probabilities of token sequences, $p(x_t \mid x_{<t}; \theta)$, and generate text completions through sampling, where $x_t \sim \mathrm{softmax}(W \cdot h_t)$, with $h_t$ as the hidden state and $W$ as the learned weight matrix mapping to token probabilities. LLMs utilize transformer architectures (Vaswani (2017)) to capture long-range dependencies in text. These models, such as Claude (Anthropic (2024)), Llama (Dubey et al. (2024); Touvron et al. (2023a,b)), and ChatGPT (Achiam et al. (2023); Hurst et al. (2024); OpenAI (2022)), leverage vast datasets and scaling techniques, thus enabling them to perform a wide array of language-based tasks, such as translation, summarization, and reasoning, by generalizing patterns learned during pretraining to novel inputs (Brown (2020)).
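To make the sampling rule above concrete, here is a minimal sketch (not from the paper) of drawing the next token from $\mathrm{softmax}(W \cdot h_t)$; the vocabulary size, hidden size, and random weights are illustrative placeholders rather than a real trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB_SIZE, HIDDEN_SIZE = 1_000, 64          # small illustrative sizes, not a real model
W = rng.normal(scale=0.02, size=(VOCAB_SIZE, HIDDEN_SIZE))  # stand-in for the learned W

def sample_next_token(h_t: np.ndarray) -> int:
    """Draw x_t ~ softmax(W · h_t), the sampling rule described above."""
    logits = W @ h_t                          # project hidden state onto the vocabulary
    probs = np.exp(logits - logits.max())     # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(VOCAB_SIZE, p=probs))

# Autoregressive generation: each sampled token would condition the next hidden state;
# here h_t is random because no real transformer is attached to this sketch.
h_t = rng.normal(size=HIDDEN_SIZE)
tokens = [sample_next_token(h_t) for _ in range(5)]
```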
LLM Agents. While LLMs demonstrate strong understanding and reasoning abilities, they face challenges when executing tasks in real-world scenarios. To overcome these limitations, their capabilities are extended through structured frameworks, enabling them to autonomously and semi-autonomously perform task execution (Chen et al. (2023b); Li et al. (2023); Qian et al. (2024); Wu et al. (2023)). These systems, referred to as agents, utilize techniques such as chain-of-thought prompting (Wei et al. (2022)), iterative refinement (Shinn et al. (2024)), self-improvement (Huang et al. (2022)), and external tool integration to execute complex workflows (Hao et al. (2024); Qin et al. (2023); Schick et al. (2023)). LLM agents have made remarkable progress in solving tasks of real-world significance, such as software engineering (Jimenez et al. (2023); Wang et al. (2024b); Yang et al. (2024)), cybersecurity (Abramovich et al. (2024); Fang et al. (2024); Wan et al. (2024)), and medical diagnosis (McDuff et al. (2023); Schmidgall et al. (2024); Tu et al. (2024)). There has also been progress in applying LLM agents to embodied problems such as autonomous robotics (Black et al. (2024); Brohan et al. (2022, 2023); Kim et al. (2024)), web tasks (Deng et al. (2024); Gur et al. (2023); He et al. (2024); Putta et al. (2024); Shi et al. (2017)), and game playing (AL et al. (2024); Feng et al. (2024); Wang et al. (2023)). For a broader overview of LLM agents, refer to Wang et al. (2024a).
Automated machine learning. Automated machine learning is an area of active research, with many approaches focused on using Kaggle, an online platform for machine learning competitions, as a benchmark for evaluating agent performance. Notable efforts include MLE-Bench (Chan et al. (2024)), DS-bench (Jing et al. (2024)), and MLAgentBench (Huang et al. (2024)), which propose using 75, 74, and 6 Kaggle challenges respectively as benchmarks to measure the abilities of ML agents in tasks such as data preparation, model development, and submission. Several ML "solvers" which can solve ML challenges have been introduced, such as AIDE (Schmidt et al. (2024)), CodeActAgent (referred to as "OpenHands") (Wang et al. (2024b)), and ResearchAgent (referred to as "MLAB") from MLAgentBench (Huang et al. (2024)), which automate feature implementation, bug fixing, and code refactoring with a high success rate. Agent K (Grosnit et al. (2024)) demonstrates the ability to solve Kaggle challenges at the human level with a challenge URL provided as input.
AI in Scientific Discovery. AI has been used to support scientific discovery across numerous disciplines for decades. For instance, AI has been used for discovery in mathematics (Romera-Paredes et al. (2024)), material science (Merchant et al. (2023); Pyzer-Knapp et al. (2022); Szymanski et al. (2023)), chemistry (Hayes et al. (2024); Jumper et al. (2021)), algorithm discovery (Fawzi et al. (2022)), and computational biology (Ding et al. (2024)). These approaches position AI as a tool rather than an agent performing research autonomously.
LLMs for research-related tasks. LLMs have demonstrated strong capabilities in diverse research-related tasks, such as code generation (Chen et al. (2021); Nijkamp et al. (2022)), end-to-end software development (Hai et al. (2024); Phan et al. (2024); Qian et al. (2023, 2024)), code generation for discovery (Chen et al. (2024b); Ghafarollahi & Buehler (2024a); Gu et al. (2024); Guo et al. (2024); Hu et al. (2024b); Ifargan et al. (2024); Majumder et al. (2024)), research question-answering (Chen et al. (2024a); Lála et al. (2023); Lin et al. (2024); Song et al. (2024)), research ideation (Baek et al. (2024); Ghafarollahi & Buehler (2024b); Li et al. (2024a); Si et al. (2024)), automated paper reviewing (D'Arcy et al. (2024); Liang et al. (2024); Lu et al. (2024b); Weng et al. (2024)), literature search (Ajith et al. (2024); Kang & Xiong (2024); Li et al. (2024b); Press et al. (2024)), and predicting the outcome of experiments (Ashokkumar et al. (2024); Lehr et al. (2024); Luo et al. (2024); Manning et al. (2024); Zhang et al. (2024)). Although LLMs have made notable progress in solving the aforementioned tasks, progress on ideation has been mixed, with some work showing that LLM ideation leads to greater novelty than humans (Si et al. (2024)), while others show reduced creativity (Chakrabarty et al. (2024)) and greater homogenization effects (Anderson et al. (2024); Zhou et al. (2024)) that may limit creative discovery without human guidance.
Additionally, research on human-AI collaboration has reached mixed conclusions about idea novelty (Ashkinaze et al. (2024); Liu et al. (2024); Padmakumar & He (2024)). These findings suggest that, with current LLMs, the strongest research systems would combine human-guided ideation with LLM-based workflows.
LLMs for autonomous research. Recent advancements in automated scientific workflows have focused on leveraging LLMs to emulate the process of research. Swanson et al. (2024) introduces a team of LLM agents working as scientists alongside a human researcher who provides high-level feedback, with the end result being novel nanobody binders aimed at addressing recent variants of SARS-CoV-2. ChemCrow (M. Bran et al. (2024)) and Coscientist (Boiko et al. (2023)) demonstrate the ability for autonomous ideation and experimentation in chemistry. ResearchAgent (Baek et al. (2024)) automates research idea generation, experiment design, and iterative refinement using feedback from reviewing agents aligned with human evaluation criteria. The AI Scientist (Lu et al. (2024a)) extends this automation to encompass end-to-end scientific discovery, including coding, experiment execution, and automated peer review for manuscript generation. Despite these advancements, studies like Si et al. (2024) highlight limitations in the feasibility and implementation details of LLM ideation, indicating a complementary rather than replacement role for LLMs in autonomous research.

Figure 2 | Agent Laboratory Workflow. This image illustrates the three primary phases of Agent Laboratory: Literature Review, Experimentation, and Report Writing, each featuring distinct tasks, tools, and human-agent roles. The pipeline integrates human input with LLM-driven agents, such as the PhD and Postdoc agents, which handle literature reviews, experimental planning, data preparation, and result interpretation. Specialized tools like mle-solver for experimentation and paper-solver for report generation automate tedious research tasks, enabling collaboration between human researchers and AI to produce high-quality research outputs.
3. Agent Laboratory
Overview. Agent Laboratory begins with the independent collection and analysis of relevant research papers, progresses through collaborative planning and data preparation, and results in automated experimentation and comprehensive report generation. As shown in Figure 2, the overall workflow consists of three primary phases: (1) Literature Review, (2) Experimentation, and (3) Report Writing. In this section, we will introduce these phases in detail along with the corresponding involved agents. Furthermore, in Section 4, we will conduct qualitative and quantitative analyses to demonstrate the strengths of Agent Laboratory and its ability to generate high-quality research outputs.
3.1. Literature Review
Literature Review. The literature review phase involves gathering and curating relevant research papers for the given research idea to provide references for subsequent stages. During this process, the PhD agent utilizes the arXiv API to retrieve related papers and performs three main actions: summary, full text, and add paper. The summary action retrieves abstracts of the top 20 papers relevant to the initial query produced by the agent. The full text action extracts the complete content of specific papers, and the add paper action incorporates selected summaries or full texts into the curated review. This process is iterative rather than a single-step operation, as the agent performs multiple queries, evaluates the relevance of each paper based on its content, and refines the
selection to build a comprehensive review. Once the specified number of relevant texts (N=max) is reached via the add paper command, the curated review is finalized for use in subsequent phases.
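As an illustration of how the summary and add paper actions could look in practice, the sketch below uses the third-party arxiv Python package; the function names, the N_MAX threshold, and the example query are assumptions for illustration, not Agent Laboratory's actual implementation.

```python
import arxiv  # third-party arXiv API client, assumed available for this sketch

N_MAX = 10            # illustrative cap on curated papers; the paper only calls this N=max
curated_review = []   # summaries or full texts accepted via the "add paper" action

def summary_action(query: str, top_k: int = 20) -> list[dict]:
    """Retrieve abstracts of the top-k papers for an agent-produced query."""
    search = arxiv.Search(query=query, max_results=top_k,
                          sort_by=arxiv.SortCriterion.Relevance)
    return [{"id": r.entry_id, "title": r.title, "abstract": r.summary}
            for r in arxiv.Client().results(search)]

def add_paper_action(paper: dict) -> bool:
    """Add a selected summary to the curated review; True means the review is full."""
    if len(curated_review) < N_MAX:
        curated_review.append(paper)
    return len(curated_review) >= N_MAX

# Iterative use: between calls the agent would refine its query and judge relevance.
for paper in summary_action("LLM agents for scientific discovery"):
    if add_paper_action(paper):
        break
```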
3.2. Experimentation
Plan Formulation. The plan formulation phase focuses on creating a detailed, actionable research plan based on the literature review and research goal. During this phase, the PhD and Postdoc agents collaborate through dialogue to specify how to achieve the research objective, detailing experimental components needed to complete the specified research idea, such as which machine learning models to implement, which datasets to use, and the high-level steps of the experiment. Once a consensus is reached, the Postdoc agent submits this plan using the plan command, which serves as a set of instructions for subsequent subtasks.
Data Preparation. The goal of the data preparation phase is to write code that prepares data for running experiments, using the instructions from the plan formulation stage as a guideline. The ML Engineer agent executes code using the python command and observes any printed output. The ML Engineer has access to Hugging Face datasets, searchable via the searchHF command. After agreeing on the finalized data preparation code, the SW Engineer agent submits it using the submit code command. Before the final submission proceeds, the code is first passed through a Python compiler to ensure that there are no compilation issues. This process is executed iteratively until the code is bug-free.
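To give a concrete sense of the kind of data-preparation code the ML Engineer agent might produce, here is a minimal sketch that loads a Hugging Face dataset and reports its splits; the dataset name and structure are illustrative assumptions, not output generated by Agent Laboratory.

```python
from datasets import load_dataset  # Hugging Face datasets library

def prepare_data(dataset_name: str = "ag_news"):
    """Load a dataset (e.g., one found via the searchHF command) and return splits."""
    ds = load_dataset(dataset_name)
    train, test = ds["train"], ds["test"]
    print(f"train={len(train)} test={len(test)} columns={train.column_names}")
    return train, test

if __name__ == "__main__":
    prepare_data()  # the printed output is what the ML Engineer agent observes
```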
Running Experiments. In the running experiments phase, the ML Engineer agent focuses on implementing and executing the experimental plan formulated prior. This is facilitated by mle-solver, a specialized module designed to generate, test, and refine machine learning code autonomously. mle-solver begins by producing initial code based on the research plan and insights from the literature review. For the first mle-solver step, the program is empty, and a file must be generated from scratch, which is then used as the top-scoring program. The following processes describe the workflow of the mle-solver:
A. Command Execution. During the command execution phase, an initial program is sampled from a maintained set of top-performing programs, which is represented by a single file during initialization. The mle-solver iteratively refines this program through two operations, REPLACE and EDIT, to better align the output with experimental objectives. The EDIT operation identifies a range of lines, substituting the code between the specified line numbers with newly generated code. In contrast, the REPLACE operation generates a completely new Python file.
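A minimal sketch of what the EDIT and REPLACE operations could look like as plain string manipulations is shown below; the helper names and command format are assumptions for illustration, since the paper does not specify the exact interface.

```python
def apply_edit(program: str, start: int, end: int, new_code: str) -> str:
    """EDIT: substitute lines start..end (1-indexed, inclusive) with new code."""
    lines = program.splitlines()
    return "\n".join(lines[:start - 1] + new_code.splitlines() + lines[end:])

def apply_replace(program: str, new_code: str) -> str:
    """REPLACE: discard the sampled program and return an entirely new file."""
    return new_code

# Example: rewrite lines 2-3 of a sampled top-scoring program.
program = "import torch\nmodel = None\ntrain(model)"
program = apply_edit(program, 2, 3, "model = build_model()\ntrain(model, epochs=5)")
```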
B. Code Execution. After a code command is executed, the new program is passed through a compiler to check for runtime errors. If it successfully compiles, a score is returned and the list of top programs is updated if the score is higher than the existing programs. If the code does not compile, the agent attempts to repair the code for N_rep tries (N_rep = 3 in our experiments) before returning an error and moving on to a new code replacement.
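The compile-and-repair loop described above can be sketched as follows, assuming a hypothetical llm_repair callable that asks the language model for a fixed program; the script-execution details are illustrative, not the paper's exact mechanism.

```python
import subprocess
import sys
import tempfile

N_REP = 3  # number of repair attempts, matching the paper's N_rep = 3

def try_run(code: str) -> tuple[bool, str]:
    """Write the candidate program to a file, run it, and capture stderr."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run([sys.executable, path], capture_output=True, text=True)
    return result.returncode == 0, result.stderr

def execute_with_repair(code: str, llm_repair) -> tuple[bool, str]:
    """Run the program; on failure, ask the LLM for a repaired version up to N_REP times."""
    for _ in range(1 + N_REP):
        ok, err = try_run(code)
        if ok:
            return True, code
        code = llm_repair(code, err)  # hypothetical LLM call proposing a fix
    return False, code  # error is reported and the solver moves to a new replacement
```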
Figure 3 | Overview of the mle-solver workflow. This diagram details the iterative process used by the mle-solver to autonomously generate machine learning code. Beginning with external resources, the workflow integrates command execution (A), where new code is generated, followed by code execution (B) to compile and repair issues if needed. Program scoring (C) evaluates the generated code using a reward function, while self-reflection (D) helps refine future iterations based on results. Performance stabilization (E) ensures consistent outcomes by maintaining a pool of top-performing programs and iterative optimization.

C. Program Scoring. If a code succeeds in compilation, it is sent to a scoring function which determines if it is better than previously implemented experiment code. In order to obtain a program score, we implement a scoring function that uses an LLM reward model to assess the effectiveness of the ML code generated by mle-solver. The reward model, invoked as an LM, scores the program on a scale from 0 to 1, considering the outlined research plan, the produced code, and the observed output to determine how accurately the program adheres to the initial goals. A score of 1 is given for results with high alignment, and everything below falls on a spectrum of how closely the output and code match the planning goals. This process is similar to existing methods for LLM reasoning tree search (Yao et al. (2024)), where instead of a series of reasoning steps being traversed using self-evaluated LLM scoring, the set of possible programs is traversed (via EDIT and REPLACE commands) and the resulting program outcome is self-evaluated to determine whether a program is worth building on. This is similar to the Solution Space Search of AIDE (Schmidt et al. (2024)); however, their method was specifically designed for the Kaggle competitions and simply extracts the accuracy rather than scoring the research code and outcomes.
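A hedged sketch of the 0-to-1 LLM reward scoring described in (C) is given below; the prompt wording and the llm callable are assumptions, not the paper's actual reward-model prompt.

```python
import re

SCORE_PROMPT = """You are a reviewer. Given the research plan, the produced code,
and its observed output, rate how closely the program adheres to the plan
on a scale from 0 to 1. Respond with a single number.

Plan:
{plan}

Code:
{code}

Output:
{output}

Score:"""

def score_program(plan: str, code: str, output: str, llm) -> float:
    """Query an LLM reward model for a 0-1 alignment score (illustrative)."""
    reply = llm(SCORE_PROMPT.format(plan=plan, code=code, output=output))
    match = re.search(r"[01](?:\.\d+)?", reply)
    return min(max(float(match.group()), 0.0), 1.0) if match else 0.0
```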
D. Self Reflection. Whether the code succeeds or fails, a self-reflection is produced based on the experimental results or the encountered error signal (Renze & Guven (2024); Shinn et al. (2024)). Here, the mle-solver is prompted to reflect on the outcome of its actions. If the program failed to compile, the solver reflects on how to fix this issue in the next iterations. If it successfully compiles and returns a score, the solver will reflect on how to increase this score. These reflections are generated to improve future performance, ensuring that the system learns from errors, improving the quality and robustness of the generated code over iterative cycles.
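A small sketch of how the self-reflection prompt could branch on compilation success is shown below; the prompt text and llm callable are illustrative assumptions.

```python
def reflect(compiled: bool, score, error, llm) -> str:
    """Generate a self-reflection to condition the next mle-solver iteration."""
    if compiled:
        prompt = (f"The program compiled and received a score of {score:.2f}. "
                  "Reflect on how the next code edit could increase this score.")
    else:
        prompt = ("The program failed to compile with the error below. Reflect on "
                  f"how to fix this issue in the next iteration.\n{error}")
    return llm(prompt)  # hypothetical LLM call; the reflection is added to context
```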
E. Performance Stabilization. To prevent performance drift, two mechanisms are implemented: top program sampling and batch-parallelization. In top program sampling, a collection of the highest-scoring programs is maintained, and one program is randomly sampled before executing a command, ensuring diversity while retaining quality. For batch-parallelization, each solver step involves making N modifications simultaneously, with the top modification selected to replace the lowest-scoring program in the top collection. These strategies use high-entropy sampling to modify the code, resulting in a balance between exploration of new solutions and refinement of existing ones in order to maintain stable code modifications. A minimal sketch of these two mechanisms is given after the figure description below.

Figure 4 | Graphical outline of paper-solver. This diagram showcases the step-by-step process of generating and refining academic research reports using the paper-solver tool. The workflow starts with the creation of an initial report scaffold (A) by iteratively generating LaTeX-based sections, followed by updates to ensure structural completeness. (B) Research is performed through an Arxiv tool during relevant sections. In the Report Editing phase (C), the language model applies targeted edits to improve the document, with LaTeX compilation verifying the integrity of changes. Finally, the completed report undergoes a reward-based evaluation during the Paper Review phase (D), ensuring alignment with academic standards and research goals.
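The sketch below illustrates top program sampling and batch-parallelization as described in (E); the pool size and the propose_and_score helper (assumed to return a (score, program) pair) are assumptions made for illustration.

```python
import random

class TopProgramPool:
    """Illustrative pool of top-scoring programs for performance stabilization."""

    def __init__(self, initial_program: str, initial_score: float, max_size: int = 4):
        self.max_size = max_size                                   # assumed pool size
        self.programs: list[tuple[float, str]] = [(initial_score, initial_program)]

    def sample(self) -> str:
        """Top program sampling: randomly pick a high scorer to modify next."""
        return random.choice(self.programs)[1]

    def step(self, propose_and_score, n_batch: int = 4) -> None:
        """Batch-parallelization: make n_batch candidate modifications and keep
        only the best, replacing the lowest-scoring program if it improves on it."""
        base = self.sample()
        best_score, best_prog = max(propose_and_score(base) for _ in range(n_batch))
        if len(self.programs) < self.max_size:
            self.programs.append((best_score, best_prog))
            return
        worst = min(range(len(self.programs)), key=lambda i: self.programs[i][0])
        if best_score > self.programs[worst][0]:
            self.programs[worst] = (best_score, best_prog)
```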
Results Interpretation. The goal of the results interpretation phase is to derive meaningful insights from experimental outcomes to inform the final report. The PhD and Postdoc agents discuss their understanding of the experimental results produced by mle-solver. Once they agree on a meaningful interpretation that could contribute to a compelling academic paper, the Postdoc agent submits it using the interpretation command, forming the basis for the report writing phase.
3.3. Report Writing
Report Writing. In the report writing phase, the PhD and Professor agents synthesize the research findings into a comprehensive academic report. This process is facilitated by a specialized module called paper-solver, which iteratively generates and refines the report. The paper-solver aims to act as a report generator, positioning the work that has been produced by previous stages of Agent Laboratory. paper-solver does not aim to entirely replace the academic paper-writing process, but rather to summarize the research that has been produced in a human-readable format so that the researcher using Agent Laboratory understands what has been accomplished. The output follows the standard structure of an academic paper, ensuring it meets conference submission requirements (for the paper scoring phase) while being clear and methodical. The following processes describe the workflow of paper-solver:
A. Initial Report Scaffold. The first task of the paper-solver is to generate an initial scaffold for the research paper. This scaffold outlines the document structure, dividing it into eight standardized sections: Abstract, Introduction, Background, Related Work, Methods, Experimental Setup, Results, and Discussion.
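A small sketch of how the initial scaffold generation could iterate over the eight standardized sections is given below; the per-section prompt and llm callable are placeholders, and the real paper-solver's LaTeX handling is more involved.

```python
SECTIONS = ["Abstract", "Introduction", "Background", "Related Work",
            "Methods", "Experimental Setup", "Results", "Discussion"]

def build_scaffold(plan: str, llm) -> str:
    """Generate an initial LaTeX scaffold, one standardized section at a time."""
    body = []
    for name in SECTIONS:
        prompt = (f"Write LaTeX placeholder content for the '{name}' section of a "
                  f"paper that follows this research plan:\n{plan}")
        body.append(f"\\section{{{name}}}\n{llm(prompt)}")  # hypothetical LLM call
    return ("\\documentclass{article}\n\\begin{document}\n"
            + "\n\n".join(body) + "\n\\end{document}")
```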