Design-Space Exploration of Reconfigurable
Hardware for Analysis
Shashidhar Mysore, Susmit Biswas {shashimc,susmit}@cs.ucsb.edu
I. Literature Survey
Content Addressable Memories (CAMs), and specifically Ternary CAMs (TCAMs), are memories used mostly in networking devices. CAMs provide read and write operations like a normal memory, but additionally support a search operation that finds the index of any matching data in the entire memory. A TCAM in particular can include wildcard bits, which match both one and zero. These wildcards can be used in the access operations of the memory (indicating that some bits of the search key are “don’t cares”) or can be stored with the data itself (indicating that some bits of the data should not be used for determining a match). The fully parallel search provided by a TCAM eases the implementation of many complex operations such as routing table lookup. Because the TCAM searches every location in memory at once, the ordering of the elements in the TCAM is less important, and large indexing structures can often be avoided entirely. This parallel search directly implements the requirements of some applications (such as IP lookup [9], [11], [6], [17]) and can serve as the building block of more complex searching schemes [13]. TCAMs are also used in other high-speed networking applications such as packet classification [4], [6], [13], access list control, and pattern matching for intrusion detection [16].
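To make the ternary matching rule concrete, the following is a minimal behavioural sketch in Python, not a hardware description. The names `tcam_search`, `care_mask` and `key_mask`, as well as the 8-bit word width, are assumptions introduced here for illustration. A real TCAM evaluates all rows in parallel, whereas the loop below visits them one by one.

```python
from typing import List, Tuple

# Each stored entry is (value, care_mask). A 1 bit in care_mask means the
# corresponding bit of value must match; a 0 bit is a stored wildcard.
Entry = Tuple[int, int]


def tcam_search(entries: List[Entry], key: int, key_mask: int) -> List[int]:
    """Return the indices of all entries that match the key.

    key_mask plays the same role for the search key: a 0 bit marks a
    "don't care" position in the key.
    """
    matches = []
    for index, (value, care_mask) in enumerate(entries):
        compared_bits = care_mask & key_mask  # bits that both sides care about
        if (value ^ key) & compared_bits == 0:
            matches.append(index)
    return matches


# Hypothetical 8-bit table; x marks a stored wildcard bit.
table = [
    (0b10100000, 0b11110000),  # 1010xxxx
    (0b10101100, 0b11111111),  # 10101100 (exact)
    (0b01000000, 0b11000000),  # 01xxxxxx
]

print(tcam_search(table, 0b10101100, 0xFF))  # [0, 1]
print(tcam_search(table, 0b10100001, 0xFF))  # [0]
print(tcam_search(table, 0b00000000, 0x00))  # key is all wildcards: [0, 1, 2]
```

When several rows match, as in the first query, a priority encoder typically selects one index. That step is omitted here.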
[7] explores three different design scenarios for Content Addressable Memories in FPGAs. The three designs are based on registers, RAM blocks, and LUTs. Although the implementation of a CAM is different from that of a Ternary CAM, [7] provides some direction towards the evaluation of CAM implementations in FPGAs, which can form a basis for evaluating our TCAM implementation on an FPGA. TCAMs are power-hungry; hence, in general, most TCAM research focuses on reducing their power consumption. RP-TCAM [14] is motivated by the fact that dynamic power consumption in TCAMs is high due to the frequent charging and discharging of the highly capacitive match lines. A selective precharge scheme is proposed in [14]: a match line is charged only if there is an exact match in the first four bits of the TCAM word. Achieving a search time of 1.86 ns, RP-TCAM consumes 80% less power.
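The effect of selective precharge can be illustrated with a simplified behavioural model. The sketch below is not the circuit of [14]; it only counts how many match lines would be charged. The four-bit prefix comes from the description above, while the 16-bit word width, the function name `selective_precharge_search`, and the assumption that the prefix bits contain no stored wildcards are introduced here for illustration.

```python
PREFIX_BITS = 4   # bits compared before a match line is charged (from the text)
WORD_BITS = 16    # hypothetical word width


def selective_precharge_search(entries, key):
    """entries is a list of (value, care_mask) pairs.

    Returns (matching indices, number of match lines charged).
    Assumes the first PREFIX_BITS of every entry hold no wildcards.
    """
    shift = WORD_BITS - PREFIX_BITS
    charged = 0
    matches = []
    for index, (value, care_mask) in enumerate(entries):
        if (value >> shift) != (key >> shift):
            continue  # prefix mismatch: this match line is never charged
        charged += 1  # prefix matched: charge the line and compare the rest
        if (value ^ key) & care_mask == 0:
            matches.append(index)
    return matches, charged


entries = [
    (0xA1F0, 0xFFF0),  # last four bits are wildcards
    (0xA200, 0xFFFF),
    (0xB1F0, 0xFFF0),
    (0xC000, 0xF000),
]

print(selective_precharge_search(entries, 0xA1F7))  # ([0], 2)
```

In this example only two of the four match lines are charged, which is the source of the dynamic power saving in this kind of scheme.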
Although TCAMs are very useful in high-speed applications, many system designers worry about their power consumption. A number of techniques have been proposed to reduce TCAM power consumption [9], [11], [12] by searching only a subset of the TCAM. A series of program profiling applications and techniques [8] and data-flow tracking mechanisms [1], [15] can be accelerated by fast TCAM lookups, provided that every possible mechanism is used to save power. In CoolCAM [9], the authors assume that power consumption is proportional to the number of rows searched. They provide a set of clever algorithms and a two-level TCAM design that requires searching fewer rows, which in turn reduces the total power consumed by a search operation. Although we find their assumptions to be a good approximation for making relative estimates, an absolute quantitative figure for power savings in terms of Joules or Watts would be better. In EaseCAM [11], a page-based scheme is used to reduce power consumption, and the authors base their savings on the CAM implementation inside Cacti (which is different from a TCAM). In addition to these row-pruning techniques, there is also some work that proposes reducing the number of bits in each comparison [5]. We plan to investigate the opportunity to improve upon the state-of-the-art implementation of TCAMs such that they provide power performance comparable to Bloom filters without losing the latency benefit.
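The row-pruning idea can be sketched as a two-level lookup in which a first level selects one partition and only the rows of that partition are searched. This is a generic illustration under the rows-proportional power assumption, not the partitioning algorithms of [9] or the paging scheme of [11]. The selector width, the 8-bit word width, and the names `build_partitions` and `two_level_search` are hypothetical, and the sketch assumes that the selector bits of every entry are free of wildcards.

```python
from collections import defaultdict

WORD_BITS = 8     # hypothetical word width
SELECT_BITS = 2   # hypothetical number of top bits used by the first level


def build_partitions(entries):
    """Group (value, care_mask) entries by their top SELECT_BITS."""
    shift = WORD_BITS - SELECT_BITS
    partitions = defaultdict(list)
    for index, (value, care_mask) in enumerate(entries):
        # Assumption of this sketch: selector bits contain no wildcards.
        if (care_mask >> shift) != (1 << SELECT_BITS) - 1:
            raise ValueError("wildcard in selector bits is not supported here")
        partitions[value >> shift].append((index, value, care_mask))
    return partitions


def two_level_search(partitions, key):
    """Return (matching indices, number of rows actually searched)."""
    shift = WORD_BITS - SELECT_BITS
    rows = partitions.get(key >> shift, [])
    matches = [i for i, value, mask in rows if (value ^ key) & mask == 0]
    return matches, len(rows)


entries = [
    (0b10100000, 0b11110000),
    (0b10101100, 0b11111111),
    (0b01000000, 0b11000000),
    (0b11000000, 0b11100000),
]
partitions = build_partitions(entries)

print(two_level_search(partitions, 0b10101100))  # ([0, 1], 2)
```

Here two rows are searched instead of four. Under the assumption that power is proportional to the number of rows searched, the relative saving follows directly from that ratio, although it says nothing about absolute figures in Joules or Watts.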
[10] proposes a pipelined CAM implementation, again motivated by the need for power reduction in CAMs. In this design, the match line is broken into smaller segments, and each segment is powered only if the previous segments matched, hence reducing power consumption. [3] describes a dynamic TCAM architecture with planar complementary capacitors, transparently scheduled refresh (TSR), autonomous power management (APM), and an address-input-free writing scheme. Once again, this paper focuses on networking applications that need low-power embedded TCAM macros. The proposed complementary cell structure of the planar dynamic TCAM (PD-TCAM) allows a small cell size of 4.79 µm² in 130 nm CMOS technology and realizes stable TCAM operation even with very small storage capacitance. FPGA manufacturers such as Xilinx [2] produce TCAM IP cores whose exact designs are not openly available. Hence we intend to compare our design with their IP cores.
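The segmented match line of [10] can be modelled in the same behavioural style by counting how many segments are powered during a search. The sketch below uses a binary CAM; the segment size, the 16-bit word width, and the name `segmented_search` are assumptions made for illustration and do not reflect the parameters used in [10].

```python
SEGMENT_BITS = 4   # hypothetical segment width
WORD_BITS = 16     # hypothetical word width


def segmented_search(words, key):
    """Return (matching indices, total number of segments powered).

    Segments are evaluated from the most significant end. A segment is
    powered only if every earlier segment of the same word matched.
    """
    segment_mask = (1 << SEGMENT_BITS) - 1
    shifts = range(WORD_BITS - SEGMENT_BITS, -1, -SEGMENT_BITS)
    matches = []
    segments_powered = 0
    for index, word in enumerate(words):
        for shift in shifts:
            segments_powered += 1
            if ((word ^ key) >> shift) & segment_mask:
                break  # mismatch: later segments of this word stay off
        else:
            matches.append(index)  # every segment matched
    return matches, segments_powered


words = [0xA1F7, 0xA2F7, 0xB1F7, 0xA1F0]

print(segmented_search(words, 0xA1F7))  # ([0], 11)
```

Without segmentation all 16 segments (4 words of 4 segments each) would be active; in this example 11 are powered because most mismatches are detected early.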
All of the related work referred to in this section either motivates our project, in terms of the applications it describes and the hardware requirements those applications have, or describes implementations and evaluations of CAM/TCAM architectures. Our work will borrow the evaluation framework from some of the work described, while proposing our own bit-vector-based TCAM implementation specialized for FPGAs.
References
[1] J. R. Crandall and F. T. Chong. Minos: Control Data Attack Prevention Orthogonal to Memory Model. In MICRO 37: Proceedings of the 37th Annual IEEE/ACM International Symposium on Microarchitecture, pages 221–232, Washington, DC, USA, 2004. IEEE Computer Society.
[2] Xilinx Inc. Xilinx TCAM core. http://xilinx.com/.
[3] K. Inoue et al. Embedded low-power dynamic TCAM architecture with transparently scheduled refresh. IEICE Transactions on Electronics, 88(4):622–629, 2005.
[4] M. Kounavis, A. Kumar, H. Vin, R. Yavatkar, and A. Campbell. Directions in Packet Classification for Network Processors. In Proc. of Network Processor Workshop in conjunction with Ninth International Symposium on High Performance Computer Architecture (HPCA-9), pages 10–22, Anaheim, CA, Feb. 2003.
[5] X. Li, Z. Liu, W. Li, and B. Liu. SCP-TCAM: A Power-Efficient Search Engine for Fast IP Lookup. In ISBN Proceedings, 2004.
[6] H. Liu. Efficient Mapping of Range Classifier into Ternary-CAM. In 10th Symposium on High Performance Interconnects HOT Interconnects (HotI’02), Stanford, CA, August 2002.
[7] K. McLaughlin, N. O’Connor, and S. Sezer. Exploring CAM design for network processing using FPGA technology. In AICT-ICIW ’06: Proceedings of the Advanced Int’l Conference on Telecommunications and Int’l Conference on Internet and Web Applications and Services, page 84, Washington, DC, USA, 2006. IEEE Computer Society.
[8] S. Mysore, B. Agrawal, T. Sherwood, N. Shrivastava, and S. Suri. Profiling over Adaptive Ranges. In Proceedings of the Fourth International Symposium on Code Generation and Optimization (CGO-4), pages 147–158, New York, NY, USA, March 2006. IEEE Computer Society.
[9] G. J. Narlikar, A. Basu, and F. Zane. CoolCAMs: Power-Efficient TCAMs for Forwarding Engines. In IEEE INFOCOM: The Conference on Computer Communications, 2003.
[10] K. Pagiamtzis and A. Sheikholeslami. A low-power content-addressable memory (CAM) using pipelined hierarchical search scheme. IEEE Journal of Solid-State Circuits, 39(9):1512–1519, September 2004.
[11] V. C. Ravikumar, R. Mahapatra, and L. Bhuyan. EaseCAM: An Energy and Storage Efficient TCAM-Based Router Architecture for IP Lookup. IEEE Trans. Comput., 54(5):521–533, 2005.
[12] S. Sharma and R. Panigrahy. Reducing TCAM Power Consumption and Increasing Throughput. In 10th Symposium on High Performance Interconnects HOT Interconnects (HotI’02), Stanford, CA, August 2002.
[13] E. Spitznagel, D. Taylor, and J. Turner. Packet Classification Using Extended TCAMs. In 11th IEEE International Conference on Network Protocols (ICNP), 2003.
[14] D. S. Vijayasarathi, M. Nourani, M. J. Akhbarizadeh, and P. T. Balsara. Ripple-precharge TCAM: a low-power solution for network search engines. ICCD, 00:243–248, 2005.
[15] M. Xu, R. Bodik, and M. D. Hill. A “Flight Data Recorder” for Enabling Full-System Multiprocessor Deterministic Replay. In ISCA ’03: Proceedings of the 30th Annual International Symposium on Computer Architecture, pages 122–135, New York, NY, USA, 2003. ACM Press.
[16] F. Yu, R. H. Katz, and T. V. Lakshman. Gigabit Rate Packet Pattern-Matching Using TCAM. In 11th IEEE International Conference on Network Protocols (ICNP), Berlin, Germany, 2004.
[17] K. Zheng, C. Hu, H. Lu, and B. Liu. An ultra high throughput and power efficient TCAM-based IP lookup engine. In INFOCOM, pages 1984–1994, 2004.