arXiv:2303.04226v1 [cs.AI] 7 Mar 2023

A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT
YIHAN CAO*, Lehigh University & Carnegie Mellon University, USA
SIYU LI, Lehigh University, USA
YIXIN LIU, Lehigh University, USA
ZHILING YAN, Lehigh University, USA
YUTONG DAI, Lehigh University, USA
PHILIP S. YU, University of Illinois at Chicago, USA
LICHAO SUN, Lehigh University, USA
Recently, ChatGPT, along with DALL-E-2 [1] and Codex [2], has been gaining significant attention from society. As a result, many individuals have become interested in related resources and are seeking to uncover the background and secrets behind its impressive performance. In fact, ChatGPT and other Generative AI (GAI) techniques belong to the category of Artificial Intelligence Generated Content (AIGC), which involves the creation of digital content, such as images, music, and natural language, through AI models. The goal of AIGC is to make the content creation process more efficient and accessible, allowing for the production of high-quality content at a faster pace. AIGC is achieved by extracting and understanding intent information from instructions provided by humans, and generating content according to its knowledge and the intent information. In recent years, large-scale models have become increasingly important in AIGC as they provide better intent extraction and, thus, improved generation results. With the growth of data and model size, the distribution that the model can learn becomes more comprehensive and closer to reality, leading to more realistic and high-quality content generation. This survey provides a comprehensive review of the history and basic components of generative models, and of recent advances in AIGC from the perspectives of unimodal and multimodal interaction. From the perspective of unimodality, we introduce the generation tasks and related models for text and image. From the perspective of multimodality, we introduce the cross-applications between the modalities mentioned above. Finally, we discuss the existing open problems and future challenges in AIGC.
CCS Concepts: • Computer systems organization → Embedded systems; Redundancy; Robotics; • Networks → Network reliability.

Additional Key Words and Phrases: datasets, neural networks, gaze detection, text tagging
ACM Reference Format:
Yihan Cao, Siyu Li, Yixin Liu, Zhiling Yan, Yutong Dai, Philip S. Yu, and Lichao Sun. 2018. A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT. J. ACM 37, 4, Article 111 (August 2018), 44 pages. https://doi.org/XXXXXXX.XXXXXXX
*Incoming Ph.D. student at Lehigh University.
Authors' addresses: Yihan Cao, yihanc@, Lehigh University & Carnegie Mellon University, Pittsburgh, PA, USA; Siyu Li, applicantlisiyu@, Lehigh University, Bethlehem, PA, USA; Yixin Liu, lis221@, Lehigh University, Bethlehem, PA, USA; Zhiling Yan, zhilingyan724@, Lehigh University, Bethlehem, PA, USA; Yutong Dai, lis221@, Lehigh University, Bethlehem, PA, USA; Philip S. Yu, University of Illinois at Chicago, Chicago, Illinois, USA, psyu@; Lichao Sun, lis221@, Lehigh University, Bethlehem, PA, USA.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@.
© 2018 Association for Computing Machinery.
0004-5411/2018/8-ART111 $15.00
https://doi.org/XXXXXXX.XXXXXXX
1 INTRODUCTION
In recent years, Artificial Intelligence Generated Content (AIGC) has gained much attention beyond the computer science community, where the whole of society has begun to take an interest in the various content generation products built by large tech companies [3], such as ChatGPT [4] and DALL-E-2 [5]. AIGC refers to content that is generated using advanced Generative AI (GAI) techniques, as opposed to being created by human authors; such techniques can automate the creation of large amounts of content in a short amount of time. For example, ChatGPT is a language model developed by OpenAI for building conversational AI systems, which can efficiently understand and respond to human language inputs in a meaningful way. In addition, DALL-E-2 is another state-of-the-art GAI model also developed by OpenAI, which is capable of creating unique and high-quality images from textual descriptions in a few minutes, such as "an astronaut riding a horse in a photorealistic style" as shown in Figure 1. Given the remarkable achievements in AIGC, many people believe it will bring a new era of AI and make significant impacts on the whole world.
[Figure 1: two DALL·E 2 outputs. Instruction 1: "An astronaut riding a horse in a photorealistic style." Instruction 2: "Teddy bears working on new AI research on the moon in the 1980s."]
Fig. 1. Examples of AIGC in image generation. Text instructions are given to the OpenAI DALL-E-2 model, and it generates two images according to the instructions.
Technically, AIGC refers to the use of GAI algorithms to generate content that satisfies human instructions, where the instructions help teach and guide the model to complete the task. This generation process usually consists of two steps: extracting intent information from human instructions, and generating content according to the extracted intentions. However, the paradigm of GAI models containing the above two steps is not entirely novel, as demonstrated by previous studies [6, 7]. The core advancements in recent AIGC compared to prior works are the result of training more sophisticated generative models on larger datasets, using larger foundation model architectures, and having access to extensive computational resources. For example, the main framework of GPT-3 remains the same as that of GPT-2, but the pre-training data size grows from WebText [8] (38 GB) to CommonCrawl [9] (570 GB after filtering), and the foundation model size grows from 1.5B to 175B parameters. Therefore, GPT-3 has better generalization ability than GPT-2 on various tasks, such as human intent extraction.
In addition to the benefits brought by the increase in data volume and computational power, researchers are also exploring ways to integrate new technologies with GAI algorithms. For example, ChatGPT utilizes reinforcement learning from human feedback (RLHF) [10–12] to determine the most appropriate response for a given instruction, thus improving the model's reliability and accuracy over time. This approach allows ChatGPT to better understand human preferences in long dialogues. Meanwhile, in computer vision, Stable Diffusion [13], proposed by Stability.AI in 2022, has also shown great success in image generation. Unlike prior methods, generative diffusion models can help generate high-resolution images by controlling the trade-off between exploration and exploitation, resulting in a harmonious combination of diversity in the generated images and similarity to the training data.
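To make the diffusion idea slightly more concrete, the sketch below is ours and assumes the standard DDPM formulation rather than any specific implementation from the works above: an image is gradually blended with Gaussian noise according to a fixed schedule, and the generative model is trained to predict (and thereby undo) the added noise.

```python
import torch

# Forward (noising) process of a diffusion model:
#   x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps,  eps ~ N(0, I)
# The generative model learns to predict eps and invert this process.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # a common linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal retention

def noise(x0: torch.Tensor, t: int):
    eps = torch.randn_like(x0)
    xt = alphas_bar[t].sqrt() * x0 + (1.0 - alphas_bar[t]).sqrt() * eps
    return xt, eps

x0 = torch.randn(3, 32, 32)   # stand-in for a normalized training image
xt, eps = noise(x0, t=500)    # halfway through the schedule
print(xt.shape)               # torch.Size([3, 32, 32])
```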
[Figure 2 diagram: prompts flow through pre-trained generative AI models (data → pre-train → prompt → decode). Unimodal example: the instruction "Please write a story about a cat." yields "Once upon a time, there was a cat named Jessy…". Multimodal examples: "Describe this picture." yields "This is a cat.", alongside instructions such as "Draw a picture of a cat." and "Write a song about a cat."]
Fig. 2. Overview of AIGC. Generally, GAI models can be categorized into two types: unimodal models and multimodal models. Unimodal models receive instructions from the same modality as the generated content, whereas multimodal models accept cross-modal instructions and produce results in different modalities.
By combining these advancements, models have made significant progress in AIGC tasks and have been adopted in various industries, including art [14], advertising [15], and education [16]. In the near future, AIGC will continue to be a significant area of research in machine learning. It is therefore crucial to conduct an extensive review of past research and identify the open problems in this field. This survey is the first that focuses on the core technologies and applications in the field of AIGC.
1.1 Major Contributions
This is the first comprehensive survey of AIGC that summarizes GAI in the aspects of techniques and applications. Previous surveys have focused on GAI from various angles, including natural language generation [17], image generation [18], and generation in multimodal machine learning [7, 19]. However, these prior works only focus on a specific part of AIGC. In this survey, we first provide a review of foundation techniques commonly used in AIGC. Then, we further offer a thorough summary of advanced GAI algorithms, both in terms of unimodal generation and multimodal generation, as shown in Figure 2. In addition, we examine the applications and potential challenges of AIGC. Finally, we highlight the open problems and future directions in this field. In summary, the main contributions of this paper are as follows:
• To the best of our knowledge, we are the first to provide a formal definition and a thorough survey of AIGC and the AI-enhanced generation process.
• We review the history and foundation techniques of AIGC, and conduct a comprehensive analysis of recent advances in GAI tasks and models from the perspective of unimodal generation and multimodal generation.
• We discuss the main challenges facing AIGC and the future research trends confronting AIGC.
1.2 Organization
The rest of the survey is organized as follows. Section 2 reviews the history of AIGC, mainly from the view of the vision and language modalities. Section 3 introduces the basic components that are widely used in today's GAI model training. Section 4 summarizes recent advances of GAI models, among which Section 4.1 reviews the advances from the unimodal perspective and Section 4.2 reviews the advances from the perspective of multimodal generation. Among multimodal generation, we introduce vision-language models, text-audio models, text-graph models, and text-code models. Sections 5 and 6 introduce the applications of GAI models in AIGC and some other important research related to this area. Furthermore, Sections 7 and 8 reveal the risks, open problems, and future directions of AIGC technologies. Finally, we conclude our research in Section 9.
2 HISTORY OF GENERATIVE AI
Generative models have a long history in artificial intelligence, dating back to the 1950s with the development of Hidden Markov Models (HMMs) [20] and Gaussian Mixture Models (GMMs) [21]. These models generated sequential data such as speech and time series. However, it wasn't until the advent of deep learning that generative models saw significant improvements in performance.
In the early years of deep generative models, different areas did not, in general, have much overlap. In natural language processing (NLP), a traditional method to generate sentences is to learn the word distribution using N-gram language modeling [22] and then search for the best sequence. However, this method cannot effectively adapt to long sentences. To solve this problem, recurrent neural networks (RNNs) [23] were later introduced for language modeling tasks, allowing relatively long dependencies to be modeled. This was followed by the development of Long Short-Term Memory (LSTM) [24] and Gated Recurrent Units (GRUs) [25], which leveraged gating mechanisms to control memory during training. These methods are capable of attending to around 200 tokens in a sample [26], which marks a significant improvement compared to N-gram language models.
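As a concrete illustration of the fixed-window limitation discussed above, the toy sketch below (ours; the corpus and the greedy decoding strategy are illustrative choices, not taken from the cited works) builds a bigram (N = 2) model from raw counts: each next word depends only on the single previous word, which is exactly why such models cannot carry information across long sentences.

```python
from collections import Counter, defaultdict

# A minimal bigram (N=2) language model: count word-pair frequencies,
# then greedily extend a sentence with the most likely next word.
corpus = [
    "the cat sat on the mat".split(),
    "the cat ate the fish".split(),
    "the dog sat on the rug".split(),
]

counts = defaultdict(Counter)
for sentence in corpus:
    for prev, word in zip(sentence, sentence[1:]):
        counts[prev][word] += 1

def generate(start: str, length: int = 5) -> list[str]:
    words = [start]
    for _ in range(length):
        followers = counts[words[-1]]
        if not followers:  # no observed continuation: the window is blind here
            break
        words.append(followers.most_common(1)[0][0])
    return words

print(generate("the"))  # e.g. ['the', 'cat', 'sat', 'on', 'the', 'cat']
```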
Meanwhile, in computer vision (CV), before the advent of deep learning-based methods, traditional image generation algorithms used techniques such as texture synthesis [27] and texture mapping [28]. These algorithms were based on hand-designed features and were limited in their ability to generate complex and diverse images. In 2014, Generative Adversarial Networks (GANs) [29] were first proposed, a significant milestone in this area due to their impressive results in various applications. Variational Autoencoders (VAEs) [30] and other methods such as diffusion generative models [31] have also been developed for more fine-grained control over the image generation process and the ability to generate high-quality images.
The advancement of generative models in various domains has followed different paths, but eventually an intersection emerged: the transformer architecture [32]. Introduced by Vaswani et al. for NLP tasks in 2017, the Transformer was later applied in CV and then became the dominant backbone for many generative models in various domains [9, 33, 34]. In the field of NLP, many prominent large language models, e.g., BERT and GPT, adopt the transformer architecture as their primary building block, offering advantages over previous building blocks, i.e., LSTM and GRU. In CV, Vision Transformer (ViT) [35] and Swin Transformer [36] later took this concept even further by combining the transformer architecture with visual components, allowing it to be applied to image-based downstream tasks. Beyond the improvements that the transformer brought to individual modalities, this intersection also enabled models from different domains to be fused together for multimodal tasks. One such example of a multimodal model is CLIP [37]. CLIP is a joint vision-language model that combines the transformer architecture with visual components, allowing it to be trained on a massive amount of text and image data. Since it combines visual and language knowledge during pre-training, it can also be used as an image encoder in multimodal prompting for generation. In all, the emergence of transformer-based models revolutionized AI generation and led to the possibility of large-scale training.
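As a hedged illustration of how a pre-trained CLIP checkpoint is typically used for image-text matching, the sketch below assumes the Hugging Face `transformers` distribution of CLIP and its public checkpoint name; it is not code from the surveyed paper.

```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Score how well each caption matches an image with a pre-trained CLIP model.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.new("RGB", (224, 224), color="gray")  # stand-in for a real photo
captions = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
logits = model(**inputs).logits_per_image   # shape (1, 2): image-text similarity
print(logits.softmax(dim=-1))               # probabilities over the two captions
```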
[Figure 3 timeline with milestones marked in 2014, 2016, 2018, and 2020: CV models such as GAN, VAE, Flow, BiGAN, RevNet, StyleGAN, BigBiGAN, DDPM, ViT, and MoCo; NLP models such as N-Gram, LSTM/GRU, Transformer, ELMo, BERT, GPT-2, T5, BART, GPT-3, OPT, Sparrow, and ChatGPT; and vision-language models such as Show-Tell, StyleNet, StackGAN, CAVP, DMGAN, VQ-VAE, VisualBERT, ViLBERT, UNITER, CLIP, ALBEF, BLIP, VQ-GAN, DALL-E, DALL-E 2, and BLIP-2.]
Fig. 3. The history of Generative AI in CV, NLP and VL.

In recent years, researchers have also begun to introduce new techniques based on these models. For instance, in NLP, instead of fine-tuning, people sometimes prefer few-shot prompting [38], which refers to including a few examples selected from the dataset in the prompt to help the model better understand task requirements (see the sketch after this paragraph). And in vision-language, researchers often combine modality-specific models with self-supervised contrastive learning objectives to provide more robust representations.
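The few-shot prompting sketch referenced above is given here; it is ours, and the task, examples, and labels are purely illustrative. A handful of labeled examples are simply concatenated into the prompt so the model can infer the task format without any gradient update.

```python
# A minimal sketch of few-shot prompting: labeled examples are prepended
# to the query so the model can infer the task format with no fine-tuning.
examples = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I want those two hours of my life back.", "negative"),
    ("A serviceable but forgettable thriller.", "negative"),
]
query = "An absolute triumph of practical effects."

prompt = "Classify the sentiment of each review.\n\n"
for text, label in examples:
    prompt += f"Review: {text}\nSentiment: {label}\n\n"
prompt += f"Review: {query}\nSentiment:"  # the model completes this line

print(prompt)
```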
In the future, as AIGC becomes increasingly important, more and more technologies shall be introduced, empowering this area with vitality.
3 FOUNDATIONS FOR AIGC
In this section, we introduce foundation models that are commonly used in AIGC.
3.1 Foundation Model
3.1.1 Transformer. The Transformer is the backbone architecture for many state-of-the-art models, such as GPT-3 [9], DALL-E-2 [5], Codex [2], and Gopher [39]. It was first proposed to solve the limitations of traditional models such as RNNs in handling variable-length sequences and context-awareness. The Transformer architecture is mainly based on a self-attention mechanism that allows the model to attend to different parts of an input sequence. The Transformer consists of an encoder and a decoder. The encoder takes in the input sequence and generates hidden representations, while the decoder takes in the hidden representations and generates the output sequence. Each layer of the encoder and decoder consists of multi-head attention and a feed-forward neural network. Multi-head attention is the core component of the Transformer, which learns to assign different weights to tokens according to their relevance. This information routing method allows the model to be better at handling long-term dependencies, hence improving performance in a wide range of NLP tasks. Another advantage of the Transformer is that its architecture makes it highly parallelizable and allows data to trump inductive biases [40]. This property makes the Transformer well-suited for large-scale pre-training, enabling transformer-based models to become adaptable to different downstream tasks.
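To ground the description above, here is a minimal NumPy sketch (ours, following the scaled dot-product formulation of Vaswani et al.; the dimensions and weights are toy values) of multi-head self-attention, the routing mechanism that assigns relevance weights between all token pairs:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """Scaled dot-product self-attention over a whole sequence at once.

    X: (seq_len, d_model); each head operates on a slice of the shared
    query/key/value projections.
    """
    seq_len, d_model = X.shape
    d_head = d_model // n_heads
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # (seq_len, d_model) each
    outputs = []
    for h in range(n_heads):
        s = slice(h * d_head, (h + 1) * d_head)
        # Every token attends to every token; weights reflect pairwise relevance.
        scores = Q[:, s] @ K[:, s].T / np.sqrt(d_head)   # (seq_len, seq_len)
        outputs.append(softmax(scores) @ V[:, s])        # (seq_len, d_head)
    return np.concatenate(outputs, axis=-1) @ Wo         # (seq_len, d_model)

rng = np.random.default_rng(0)
d_model, n_heads, seq_len = 8, 2, 4
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv, Wo = (rng.normal(size=(d_model, d_model)) for _ in range(4))
print(multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads).shape)  # (4, 8)
```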
3.1.2 Pre-trained Language Models. Since the introduction of the Transformer architecture, it has become the dominant choice in natural language processing due to its parallelism and learning capabilities. Generally, these transformer-based pre-trained language models can be classified into two types according to their training tasks: autoregressive language modeling and masked language modeling [41]. Given a sentence composed of several tokens, the objective of masked language modeling, e.g., BERT [42] and RoBERTa [43], is to predict the probability of a masked token given the context information. The most notable example of masked language modeling is BERT [42], which includes masked language modeling and next sentence prediction tasks. RoBERTa [43], which uses the same architecture as BERT, improves its performance by increasing the amount of pre-training data and incorporating more challenging pre-training objectives. XL-Net [44], which is also based on BERT, incorporates permutation operations to change the prediction order for each training iteration, allowing the model to learn more information across tokens. Autoregressive language modeling, e.g., GPT-3 [9] and OPT [45], instead models the probability of the next token given previous tokens, i.e., left-to-right language modeling. Different from masked language models, autoregressive models are more suitable for generative tasks. We will introduce more about autoregressive models in Section 4.1.1.

[Figure 4 diagram: three stacks of transformer (Trm) blocks mapping input embeddings E1…EN to outputs T1…TN, illustrating encoder-only, decoder-only, and encoder-decoder architectures.]
Fig. 4. Categories of pre-trained LLMs. Black lines represent information flow in bidirectional models, while gray lines represent left-to-right information flow. Encoder models, e.g., BERT, are trained with context-aware objectives. Decoder models, e.g., GPT, are trained with autoregressive objectives. Encoder-decoder models, e.g., T5 and BART, combine the two, using context-aware structures as encoders and left-to-right structures as decoders.
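The two training objectives can be contrasted in a few lines. The sketch below is ours, with random logits standing in for a real encoder or decoder network: masked language modeling scores predictions only at masked positions, while autoregressive modeling shifts the sequence so each position predicts the next token.

```python
import torch
import torch.nn.functional as F

# Toy vocabulary and a "model" that just returns random logits; in practice
# these come from a transformer encoder (BERT-style) or decoder (GPT-style).
vocab_size, seq_len = 100, 6
tokens = torch.randint(0, vocab_size, (seq_len,))
logits = torch.randn(seq_len, vocab_size)

# Masked language modeling (BERT-style): hide a subset of positions and
# predict them from the full bidirectional context.
mask = torch.tensor([False, True, False, False, True, False])
mlm_loss = F.cross_entropy(logits[mask], tokens[mask])

# Autoregressive language modeling (GPT-style): position t predicts
# token t+1, so the loss pairs logits[:-1] with tokens[1:].
ar_loss = F.cross_entropy(logits[:-1], tokens[1:])

print(f"MLM loss: {mlm_loss:.3f}, AR loss: {ar_loss:.3f}")
```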
3.2 Reinforcement Learning from Human Feedback
Despite being trained on large-scale data, AIGC may not always produce output that aligns with the user's intent, which includes considerations of usefulness and truthfulness. In order to better align AIGC output with human preferences, reinforcement learning from human feedback (RLHF) has been applied to fine-tune models in various applications such as Sparrow, InstructGPT, and ChatGPT [10, 46].
Typically, the whole pipeline of RLHF includes the following three steps: pre-training, reward learning, and fine-tuning with reinforcement learning. First, a language model $\pi_{\theta_0}$ is pre-trained on large-scale datasets as an initial language model. Since the (prompt, answer) pairs given by $\pi_{\theta_0}$ might not align with human purposes, in the second step we train a reward model to encode diversified and complex human preferences. Specifically, given the same prompt $x$, different generated answers $\{g_1, g_2, \cdots, g_n\}$ are evaluated by humans in a pairwise manner. The pairwise comparison relationships are later converted to pointwise reward scalars $\{r_1, r_2, \cdots, r_n\}$ using an algorithm such as ELO [47]. In the final step, the language model is fine-tuned to maximize the learned reward function using reinforcement learning. To stabilize the RL training, Proximal Policy Optimization (PPO) is often used as the RL algorithm. In each episode of RL training, an empirically estimated KL penalty term is added to prevent the model from outputting something peculiar that tricks the reward model. Specifically, the total reward $r_{\text{total}}$ at each step is given by $r_{\text{total}}(x, g) = r_{\text{RM}}(x, g) - \lambda_{\text{KL}} D_{\text{KL}}(\pi_\theta \,\|\, \pi_{\theta_0})$, where $r_{\text{RM}}$ is the learned reward model, $D_{\text{KL}}$ is the KL penalty term, and $\pi_\theta$ is the trained policy. For more details on RLHF, please refer to [48].
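A minimal sketch of the KL-penalized reward above follows; it is ours, and the log-probabilities and reward value are toy stand-ins for a real policy, reference model, and reward model.

```python
import torch

def total_reward(r_rm: float, logp_policy: torch.Tensor,
                 logp_ref: torch.Tensor, lam_kl: float = 0.1) -> torch.Tensor:
    """r_total(x, g) = r_RM(x, g) - lambda_KL * D_KL(pi_theta || pi_theta0).

    logp_policy / logp_ref hold the log-probabilities that the current policy
    and the frozen initial model assign to the sampled answer tokens; their
    summed difference is the empirical KL estimate used in PPO-based RLHF.
    """
    kl = (logp_policy - logp_ref).sum()  # empirical KL estimate over the answer
    return r_rm - lam_kl * kl            # scalar reward for the episode

logp_policy = torch.tensor([-1.2, -0.8, -2.0])  # toy per-token log-probs
logp_ref = torch.tensor([-1.5, -0.9, -1.7])
print(total_reward(r_rm=0.6, logp_policy=logp_policy, logp_ref=logp_ref))
```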
[Figure 5 chart: model parameter counts from 100M to 1T (e.g., GPT, BERT, GPT-2, RoBERTa, ERNIE, VisualBERT, CLIP, DALL-E, Megatron, T5, GPT-3, Switch, GLIDE, Imagen, DALL-E-2, OPT, BLOOM, PaLM, ChatGPT) by year from 2018 to 2023, alongside relative training speeds (1x to 9x, based on the V100 16G) of devices from the RTX 8000 and V100 16G through the RTX 3090/4090 and A100 40G/80G to the H100 80G Gen5 and H100 80G SXM5.]
Fig. 5. Statistics of model size [52] and training speed across different models and computing devices.
Although RLHF has shown promising results by incorporating fluency, progress in this field is impeded by a lack of publicly available benchmarks and implementation resources, leading to a perception that RL is a challenging approach for NLP. To address this issue, an open-source library named RL4LMs [49] has recently been introduced, consisting of building blocks for fine-tuning and evaluating RL algorithms on LM-based generation.
Beyond human feedback, the latest dialogue agent, Claude, favors Constitutional AI [50], where the reward model is learned via RL from AI Feedback (RLAIF). Both the critiques and the AI feedback are guided by a small set of principles drawn from a "constitution", which is the only thing provided by humans in Claude. The AI feedback focuses on controlling the outputs to be less harmful by explaining its objections to dangerous queries. Moreover, a recent preliminary theoretical analysis of RLAIF [51] justifies the empirical success of RLHF and provides new insights for specialized RLHF algorithm design for language models.
3.3 Computing
3.3.1 Hardware. In recent years, significant hardware advancements have facilitated the training of large-scale models. In the past, training a large neural network using CPUs could take several days or even weeks. However, with the emergence of more powerful computing resources, this process has been accelerated by several orders of magnitude. For instance, the NVIDIA A100 GPU is seven times faster than the V100 and 11 times faster than the T4 during BERT-large inference. Additionally, Google's Tensor Processing Units (TPUs), which are designed specifically for deep learning, offer even higher computing performance compared to the current generation of A100 GPUs. This rapid progress in computing power has significantly increased the efficiency of AI model training and opened