How big are PHP arrays (and values) really?
2012-08-05 00:00
In this post I want to investigate the memory usage of PHP arrays (and values in general) using the following script as an example, which creates 100000 unique integer array elements and measures the resulting memory usage:
    <?php
    $startMemory = memory_get_usage();
    $array = range(1, 100000);
    echo memory_get_usage() - $startMemory, ' bytes';
How much would you expect it to be? Simple, one integer is 8 bytes (on a 64 bit unix machine and using the long type) and you got 100000 integers, so you obviously will need 800000 bytes. That’s something like 0.76 MB.
Now try and run the above code. You can do it online if you want. This gives me 14649024 bytes. Yes, you heard right, that’s 13.97 MB - eighteen times more than we estimated.
So, where does that extra factor of 18 come from?
Summary
For those who don’t want to know the full story, here is a quick summary of the memory usage of the different components involved:
                             |  64 bit   | 32 bit
---------------------------------------------------
zval                         |  24 bytes | 16 bytes
+ cyclic GC info             |   8 bytes |  4 bytes
+ allocation header          |  16 bytes |  8 bytes
===================================================
zval (value) total           |  48 bytes | 28 bytes
===================================================
bucket                       |  72 bytes | 36 bytes
+ allocation header          |  16 bytes |  8 bytes
+ pointer                    |   8 bytes |  4 bytes
===================================================
bucket (array element) total |  96 bytes | 48 bytes
===================================================
total total                  | 144 bytes | 76 bytes
The above numbers will vary depending on your operating system, your compiler and your compile options. E.g. if you compile PHP with debug or with thread-safety, you will get different numbers. But I think that the sizes given above are what you will see on an average 64-bit production build of PHP 5.3 on Linux.
If you multiply those 144 bytes by our 100000 elements you get 14400000 bytes, which is 13.73 MB. That’s pretty close to the real number - the rest is mostly pointers for uninitialized buckets, but I’ll cover that later.
Now, if you want to have a more detailed analysis of the values mentioned above, read on :)
The zvalue_value union
First have a look at how PHP stores values. As you know PHP is a weakly typed language, so it needs some way to switch between the various types fast. PHP uses a union for this, which is defined as follows in zend.h#307 (comments mine):
    typedef union _zvalue_value {
        long lval;                // For integers and booleans
        double dval;              // For floats (doubles)
        struct {                  // For strings
            char *val;            //     consisting of the string itself
            int len;              //     and its length
        } str;
        HashTable *ht;            // For arrays (hash tables)
        zend_object_value obj;    // For objects
    } zvalue_value;
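If you don’t know C, that isn’t a problem as the code is pretty straightforward: A union is a means to make some value accessible as various types. For example if you do a zvalue_value->lval you’ll get the value interpreted as an integer. If you use zvalue_value->ht on the other hand the value will be interpreted as a pointer to a hash table (aka array).
But let’s not get too much into this here. Important for us only is that the size of a union equals the size of its largest component. The largest component here is the string struct (the zend_object_value struct has the same size as the str struct, but I’ll leave that out for simplicity). The string struct stores a pointer (8 bytes) and an integer (4 bytes), which is 12 bytes in total. Due to memory alignment (structs with 12 bytes aren’t cool because they aren’t a multiple of 64 bits / 8 bytes) the total size of the struct will be 16 bytes though and that will also be the size of the union as a whole.
So now we know that we don’t need 8 bytes for every value, but 16 - due to PHP’s dynamic typing. Multiplying by 100000 values gives us 1600000 bytes, i.e. 1.53 MB. But the real value is 13.97 MB, so we can’t be there yet.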
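If you would rather let the compiler do the padding arithmetic, something like the following sketch could be used. It is not the Zend source: the `demo_` names are made up for illustration, and `HashTable *` and `zend_object_value` are replaced by placeholders whose layout is an assumption.

```c
#include <stddef.h>

/* Simplified stand-in for zvalue_value. HashTable and zend_object_value
 * are replaced by placeholders of comparable shape (an assumption). */
typedef union demo_zvalue_value {
    long lval;
    double dval;
    struct {
        char *val;
        int len;
    } str;
    void *ht;                    /* placeholder for HashTable * */
    struct {
        unsigned int handle;     /* placeholder members */
        const void *handlers;
    } obj;
} demo_zvalue_value;

/* Round size up to the next multiple of alignment (alignment must be > 0). */
size_t demo_align_up(size_t size, size_t alignment)
{
    return (size + alignment - 1) / alignment * alignment;
}

/* Size the compiler actually gives the union, padding included. */
size_t demo_zvalue_value_size(void)
{
    return sizeof(demo_zvalue_value);
}
```

Calling `demo_align_up(12, 8)` reproduces the step from 12 to 16 bytes by hand, while `demo_zvalue_value_size()` reports what the compiler decided for the platform it was built on, which may differ on anything other than a typical 64-bit Linux build.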
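The zval struct
And this is quite logical - the union only stores the value itself, but PHP obviously also needs to store its type and some garbage collection information. The structure holding this information is called a zval and you may have already heard of it. For more information on why PHP needs it I’d recommend reading an article by Sara Golemon. Anyways, this struct is defined as follows:
    struct _zval_struct {
        zvalue_value value;     // The value
        zend_uint refcount__gc; // The number of references to this value (for GC)
        zend_uchar type;        // The type
        zend_uchar is_ref__gc;  // Whether this value is a reference (&)
    };
The size of a struct is determined by the sum of the sizes of its components: The zvalue_value is 16 bytes (as computed above), the zend_uint is 4 bytes and the zend_uchars are 1 byte each. That’s a total of 22 bytes. Again due to memory alignment the real size will be 24 bytes though.
So if we store 100000 elements at 24 bytes each that would be 2400000 in total, which is 2.29 MB. The gap is closing, but the real value is still more than six times larger.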
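The cycles collector (as of PHP 5.3)
PHP 5.3 introduced a new garbage collector for cyclic references. For this to work PHP has to store some additional data. I don’t want to explain how the algorithm works here, you can read that up in the PHP manual. Important for our size calculations is that PHP will wrap every zval into a zval_gc_info:
    typedef struct _zval_gc_info {
        zval z;
        union {
            gc_root_buffer       *buffered;
            struct _zval_gc_info *next;
        } u;
    } zval_gc_info;
As you can see Zend only adds a union on top of it, which consists of two pointers. As you hopefully remember the size of a union is the size of its largest component: Both union components are pointers, thus both have a size of 8 bytes. So the size of the union is 8 bytes too.
If we add that on top of the 24 bytes we already have we get 32 bytes. Multiply that by the 100000 elements and we get a memory usage of 3.05 MB.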
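The 24 and 32 byte figures can be cross-checked with a small stand-in for the two structs. This is only a sketch: the `demo_` types are hypothetical, `zend_uint` and `zend_uchar` are assumed to be `unsigned int` and `unsigned char`, and `gc_root_buffer *` is replaced by a plain `void *`.

```c
#include <stddef.h>

typedef union demo_value {          /* stand-in for zvalue_value */
    long lval;
    double dval;
    struct { char *val; int len; } str;
} demo_value;

typedef struct demo_zval {          /* stand-in for struct _zval_struct */
    demo_value value;
    unsigned int refcount;          /* zend_uint assumed to be unsigned int */
    unsigned char type;             /* zend_uchar assumed to be unsigned char */
    unsigned char is_ref;
} demo_zval;

typedef struct demo_zval_gc_info {  /* stand-in for zval_gc_info */
    demo_zval z;
    union {
        void *buffered;             /* placeholder for gc_root_buffer * */
        struct demo_zval_gc_info *next;
    } u;
} demo_zval_gc_info;

/* Size of the zval stand-in, trailing padding included. */
size_t demo_zval_size(void) { return sizeof(demo_zval); }

/* Size of the zval plus the GC union wrapped around it. */
size_t demo_zval_gc_info_size(void) { return sizeof(demo_zval_gc_info); }

/* Offset of the GC union: everything before it is the padded zval. */
size_t demo_gc_union_offset(void) { return offsetof(demo_zval_gc_info, u); }
```

On a build with 8 byte pointers the three functions should report 24, 32 and 24, so the GC union starts right after the padded zval and adds one pointer's worth of memory.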
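The Zend MM allocator
C unlike PHP does not manage memory for you. You need to keep track of your allocations yourself. For this purpose PHP uses a custom memory manager that is optimized specifically for its needs: The Zend Memory Manager. The Zend MM is based on Doug Lea’s malloc and adds some PHP specific optimizations and features (like memory limit, cleaning up after each request and stuff like that).
What is important for us here is that the MM adds an allocation header to every allocation done through it. It is defined as follows:
    typedef struct _zend_mm_block {
        zend_mm_block_info info;
    #if ZEND_DEBUG
        unsigned int magic;
    # ifdef ZTS
        THREAD_T thread_id;
    # endif
        zend_mm_debug_info debug;
    #elif ZEND_MM_HEAP_PROTECTION
        zend_mm_debug_info debug;
    #endif
    } zend_mm_block;
    typedef struct _zend_mm_block_info {
    #if ZEND_MM_COOKIES
        size_t _cookie;
    #endif
        size_t _size; // size of the allocation
        size_t _prev; // previous block (not sure what exactly this is)
    } zend_mm_block_info;
As you can see the definitions are cluttered with lots of compile option checks. So if one of those options is set the allocation header will be bigger and will be largest if you build PHP with heap protection, multi-threading, debug and MM cookies.
For this example though we will assume that all those options are disabled. In that case the only thing left are the two size_ts _size and _prev. A size_t has 8 bytes (on 64 bit), so the allocation header has a total size of 16 bytes - and that header is added on every allocation.
So now we need to adjust our zval size again. In reality it isn’t 32 bytes, but it’s 48, due to that allocation header. Multiplied by our 100000 elements that’s 4.58 MB. The real value is 13.97 MB, so we already got approximately a third covered.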
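As a rough illustration of the header overhead, the sketch below models an allocation as payload plus header. It assumes a build with debug, heap protection and cookies all disabled, uses made-up `demo_` names, and ignores whatever rounding or minimum block sizes the real allocator applies.

```c
#include <stddef.h>

/* Stand-in for zend_mm_block_info with ZEND_MM_COOKIES disabled. */
typedef struct demo_mm_block_info {
    size_t size;
    size_t prev;
} demo_mm_block_info;

/* Header size in a build without debug, heap protection or cookies. */
size_t demo_mm_header_size(void)
{
    return sizeof(demo_mm_block_info);
}

/* Memory taken by one allocation: payload plus header.
 * Rounding and minimum block sizes of the real allocator are ignored. */
size_t demo_allocation_footprint(size_t payload)
{
    return payload + demo_mm_header_size();
}
```

With an 8 byte `size_t`, `demo_allocation_footprint(32)` gives the 48 bytes per zval used above, and the same helper applies to any other structure that gets its own allocation.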
Buckets
Until now we have only considered single values. But array structures in PHP also take up lots of space: “Array” actually is a badly chosen term here. PHP arrays are hash tables/dictionaries in reality. So how do hash tables work? Basically for every key a hash is generated and that hash is used as an offset into a “real” C array. As the hashes can clash, all elements that have the same hash are stored in a linked list. When accessing an element PHP first computes the hash, looks for the right bucket and then traverses the linked list, comparing the exact key, element by element. A bucket is defined as follows (see zend_hash.h#54):
    typedef struct bucket {
        ulong h;                   // The hash (or for int keys the key)
        uint nKeyLength;           // The length of the key (for string keys)
        void *pData;               // The actual data
        void *pDataPtr;            // ??? What's this ???
        struct bucket *pListNext;  // PHP arrays are ordered. This gives the next element in that order
        struct bucket *pListLast;  // and this gives the previous element
        struct bucket *pNext;      // The next element in this (doubly) linked list
        struct bucket *pLast;      // The previous element in this (doubly) linked list
        const char *arKey;         // The key (for string keys)
    } Bucket;
As you can see one needs to store loads of data to get the kind of abstract array data structure that PHP uses (PHP arrays are arrays, dictionaries and linked lists at the same time, that sure needs much info). The sizes of the individual components are 8 bytes for the unsigned long, 4 bytes for the unsigned int and 7 times 8 bytes for the pointers. That’s a total of 68. Add alignment and you get 72 bytes.
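The 68 versus 72 byte difference for the bucket can be made visible by comparing the plain sum of the member sizes with what `sizeof` reports. The struct below mirrors the Bucket definition with `ulong` and `uint` spelled out as `unsigned long` and `unsigned int` (an assumption about those typedefs); the `demo_` names are hypothetical.

```c
#include <stddef.h>

/* Mirror of the Bucket layout, with ulong/uint written out in full. */
typedef struct demo_bucket {
    unsigned long h;
    unsigned int nKeyLength;
    void *pData;
    void *pDataPtr;
    struct demo_bucket *pListNext;
    struct demo_bucket *pListLast;
    struct demo_bucket *pNext;
    struct demo_bucket *pLast;
    const char *arKey;
} demo_bucket;

/* Sum of the member sizes, ignoring any padding. */
size_t demo_bucket_member_sum(void)
{
    return sizeof(unsigned long) + sizeof(unsigned int) + 7 * sizeof(void *);
}

/* Size the compiler actually uses, padding included. */
size_t demo_bucket_size(void)
{
    return sizeof(demo_bucket);
}

/* Bytes lost to alignment padding inside the struct. */
size_t demo_bucket_padding(void)
{
    return demo_bucket_size() - demo_bucket_member_sum();
}
```

On 64-bit Linux the padding should come out as 4 bytes, inserted after `nKeyLength` so that the following pointer starts on an 8 byte boundary.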
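Buckets like zvals need to be allocated on the heap, so we need to add the 16 bytes for the allocation header again, giving us 88 bytes. Also we need to store pointers to those buckets in the “real” C array (Bucket **arBuckets;) I mentioned above, which adds another 8 bytes per element. So all in all every bucket needs 96 bytes of storage.
So if we need a bucket for every value, that’s 96 bytes for the bucket and 48 bytes for the zval, which is 144 bytes in total. For 100000 elements that’s 14400000 bytes aka 13.73 MB.
Mystery solved.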
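Wait, there’s another 0.24 MB left!
Those last 0.24 MB are due to uninitialized buckets: The size of the real C array storing the buckets should ideally be approximately the same as the number of array elements stored. This way you have the least number of collisions (unless you want to waste lots of memory.) But PHP obviously can’t reallocate the whole array every time an element is added - that would be reeeally slow. Instead PHP always doubles the size of the internal bucket array if it hits the limit. So the size of the array is always a power of two. In our case it is 2^17 = 131072. But we need only 100000 of those buckets, so we are leaving 31072 buckets unused. Those buckets will not be allocated (so we don’t need to spend the full 96 bytes), but the memory for the bucket pointer (the one stored in the internal bucket array) still needs to be allocated. So we additionally use 8 bytes (a pointer) * 31072 elements. This is 248576 bytes or 0.24 MB. That matches the missing memory. (Sure, there are still a few bytes missing, but I don’t really want to cover them here. They are things like the hash table structure itself, variables, etc.)
Mystery really solved.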
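The doubling rule and the resulting leftover can be put into a few lines of arithmetic. The sketch below is not PHP's hash table code: the function names are made up, and the starting size of 8 slots is an assumption.

```c
#include <stddef.h>

/* Smallest power of two >= n, starting from an assumed minimum of 8 slots. */
size_t demo_table_size(size_t n)
{
    size_t size = 8;
    while (size < n) {
        size *= 2;
    }
    return size;
}

/* Bytes spent on pointer slots that have no bucket behind them. */
size_t demo_unused_slot_bytes(size_t n, size_t pointer_size)
{
    return (demo_table_size(n) - n) * pointer_size;
}

/* Estimate: per-element cost times n, plus the unused pointer slots. */
size_t demo_estimated_array_bytes(size_t n, size_t per_element, size_t pointer_size)
{
    return n * per_element + demo_unused_slot_bytes(n, pointer_size);
}
```

For 100000 elements at 144 bytes each and 8 byte pointers the estimate is 14648576 bytes, which is 448 bytes short of the measured 14649024. That small remainder is the "few bytes missing" for the hash table structure itself and the like.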
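What does this tell us?
PHP ain’t C. That’s all this should tell us. You can’t expect that a super dynamic language like PHP has the same highly efficient memory usage that C has. You just can’t. But if you do want to save memory you could consider using an SplFixedArray for large, static arrays.
Have a look at this modified script:
    <?php
    $startMemory = memory_get_usage();
    $array = new SplFixedArray(100000);
    for ($i = 0; $i < 100000; ++$i) {
        $array[$i] = $i;
    }
    echo memory_get_usage() - $startMemory, ' bytes';
It basically does the same thing, but if you run it, you’ll notice that it uses “only” 5600640 bytes. That’s 56 bytes per element and thus much less than the 144 bytes per element a normal array uses. This is because a fixed array doesn’t need the bucket structure: So it only requires one zval (48 bytes) and one pointer (8 bytes) for each element, giving us the observed 56 bytes.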
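To put the two layouts side by side, here is a small back-of-the-envelope sketch. It assumes the per-element costs worked out in this post (48 bytes per allocated zval, 96 bytes per bucket including its header and pointer slot, 8 bytes per pointer); the function names are hypothetical.

```c
#include <stddef.h>

/* Assumed cost of n SplFixedArray elements: one pointer slot plus one
 * separately allocated zval per element. */
size_t demo_fixed_array_bytes(size_t n, size_t zval_cost, size_t pointer_size)
{
    return n * (zval_cost + pointer_size);
}

/* Assumed cost of n regular array elements: one zval plus one bucket
 * (bucket_cost already includes its allocation header and pointer slot). */
size_t demo_hash_array_bytes(size_t n, size_t zval_cost, size_t bucket_cost)
{
    return n * (zval_cost + bucket_cost);
}
```

With n = 100000 this gives 5600000 bytes for the fixed array against 14400000 bytes for the regular one, close to the measured 5600640 and 14649024 bytes.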