
High-performance sorting on networks of workstations
Figure 1: Development of Sorting Algorithm. The sections of the paper follow the same development as the progression of sorting algorithms, from the simplest to our most complex. [Figure boxes: One-Pass Single-Node (Section 4), One-Pass Parallel (Section 5), Two-Pass Single-Node (Section 6), Two-Pass Parallel (Section 7).]

... where more complex versions build from the simpler ones. We will show that by first understanding and tuning application performance in simple configurations, we are able to build highly-scalable parallel versions with little effort. Our best Datamation benchmark time is given for the one-pass parallel version in Section 5, and the best MinuteSort results for the two-pass parallel sort in Section 7. We present our conclusions in Section 8.
The Datamation sorting benchmark was introduced in 1985 by a group of database experts as a test of a processor’s I/O subsystem and operating system [17]. The performance metric of this benchmark is the elapsed time to sort one million records from disk to disk. The records begin on disk and are each 100 bytes, where the first 10 bytes are the key. Thus, with one million 100-byte records, 95 MB of data are read from and written to disk. The elapsed time includes the time to launch the application, open, create, and close all files, ensure the output resides on disk, and terminate the program. Price-performance of the hardware and software is computed by pro-rating the five-year cost over the time of the sort. The previous record-holder on this benchmark was a 12-processor SGI Challenge with 96 disks and 2.25 GB of main memory [30] at 3.52 seconds; a single-processor IBM RS/6000 with 8 disks and 256 MB of memory has an impressive time of 5.1 seconds and better price/performance, but uses raw disk, which is not allowed in the benchmark [1].
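The record format described above can be illustrated with a small sketch. It is an in-memory toy under stated assumptions, not the sorting implementation of the paper: the helper names (`make_records`, `sort_records`, `dataset_mib`) are hypothetical, the record contents are random bytes, and no disk I/O is performed. It shows only that records are 100 bytes, that ordering depends on the first 10 bytes, and that one million records come to roughly 95 MB when a megabyte is counted as 2^20 bytes.

```python
import random

RECORD_SIZE = 100  # bytes per record, as in the benchmark definition
KEY_SIZE = 10      # the first 10 bytes of each record are the key


def make_records(count, seed=0):
    """Generate `count` random 100-byte records (hypothetical test data)."""
    rng = random.Random(seed)
    return [
        bytes(rng.getrandbits(8) for _ in range(RECORD_SIZE))
        for _ in range(count)
    ]


def sort_records(records):
    """Return the records ordered by their 10-byte key."""
    return sorted(records, key=lambda record: record[:KEY_SIZE])


def dataset_mib(count):
    """Size of `count` records in units of 2**20 bytes."""
    return count * RECORD_SIZE / 2**20


if __name__ == "__main__":
    ordered = sort_records(make_records(1000))
    print(len(ordered), "records sorted")
    print(round(dataset_mib(1_000_000), 1), "MB for one million records")
```

A real benchmark run would read the records from disk, write the sorted output back, and include startup and shutdown in the elapsed time, all of which this sketch leaves out.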
Recognizing that the Datamation benchmark is outdated and is more a test of startup and shutdown time than I/O performance, the authors of AlphaSort introduced MinuteSort in 1994 [26]. The key and record specifications are identical to those of Datamation; the performance metric is now the amount of data that can be sorted in one minute of elapsed time. Price-performance is calculated from the list price of the hardware and operating system depreciated over three years. The SGI system was also the previous record-holder on the MinuteSort benchmark, sorting 1.6 GB. AlphaSort achieved 1.1 GB on only three processors, 36 disks, and 1.25 GB of memory, for better price/performance [26].
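The two price-performance rules (five-year pro-rating for Datamation, three-year depreciation for MinuteSort) amount to simple arithmetic, sketched below. The function names, the 365-day year, and the example list price are assumptions made for illustration; the benchmark definitions may round or count time differently.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: a 365-day year


def prorated_cost(list_price, elapsed_seconds, depreciation_years):
    """Share of the list price attributable to a run of the given length."""
    return list_price * elapsed_seconds / (depreciation_years * SECONDS_PER_YEAR)


def datamation_cost(list_price, sort_seconds):
    """Datamation: five-year cost pro-rated over the time of the sort."""
    return prorated_cost(list_price, sort_seconds, 5)


def minutesort_cost(list_price):
    """MinuteSort: list price depreciated over three years, for one minute."""
    return prorated_cost(list_price, 60, 3)


if __name__ == "__main__":
    price = 1_000_000  # hypothetical list price in dollars
    print(f"Datamation, 3.52 s sort: ${datamation_cost(price, 3.52):.4f}")
    print(f"MinuteSort, one minute:  ${minutesort_cost(price):.4f}")
```

Under these assumptions a cheaper system can win on price-performance even with a slower time or a smaller sorted volume, which is the comparison the text draws for the RS/6000 and AlphaSort results.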
Over the years, numerous authors have reported the performance of their sorting algorithms and implementations [1, 6, 7, 15, 16, 21, 25, 26, 27, 30, 35], and we try to leverage many of the implementation and algorithmic lessons that they describe.
One difference between most of this work and ours is that we provide measurements for a range of system configurations, varying the number of processors, the number of disks per machine, and the amount of memory. Another difference is that our environment is one of the few parallel configurations where each node is a complete system, with its own virtual memory system, disks, and file system.
Two different cluster environments form our experimental test-bed. The first consists of 64 commodity UltraSPARC I workstations, each with 64 MB of memory (however, most measurements only extend to 32 nodes due to time constraints). Each workstation houses two internal 5400 RPM Seagate Hawk disks on a single fast-narrow SCSI bus. Note that with only two disks per machine, we cannot afford to dedicate a spare disk for paging activity.

Table 1: Hardware List Prices. October 1996 list prices. [Table only partly recoverable; surviving entries in source order:]
Ultra Enterprise I Model 170: $15,495
$16 per MB
Internal 5400 RPM Seagate Hawk
Enclosure for 8 External Disks
Myrinet 4.1-M2F Card
(8 x 128 MB, 8 x 2 5400 RPM Disks, 4 Disk Enclosures, 8 Myrinet Cards, ...
(64 x 64 MB, 64 x 2 5400 RPM Disks, ...
$1,190,080

The second cluster connects eight more fully-equipped UltraSPARC I Model 170 workstations. Each contains 128 MB of main memory and an extra fast-wide SCSI card, with two 7200 RPM Seagate Barracuda external disks attached. Thus, while this cluster only contains one eighth of the processors in the other configuration, it contains one quarter of the number of disks and amount of memory.
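The one-eighth and one-quarter ratios can be checked with a few lines of arithmetic. The sketch below assumes that each node of the eight-node cluster keeps two internal disks in addition to its two external Barracuda disks (four disks per node); that count is not stated outright in the text but is what the one-quarter disk ratio implies. The `Cluster` type and the constant names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    nodes: int
    memory_mb_per_node: int
    disks_per_node: int

    @property
    def total_memory_mb(self):
        return self.nodes * self.memory_mb_per_node

    @property
    def total_disks(self):
        return self.nodes * self.disks_per_node


# 64 nodes, 64 MB each, two internal disks each (from the text).
LARGE = Cluster(nodes=64, memory_mb_per_node=64, disks_per_node=2)
# 8 nodes, 128 MB each; four disks each is an assumption (2 internal + 2 external).
SMALL = Cluster(nodes=8, memory_mb_per_node=128, disks_per_node=4)

if __name__ == "__main__":
    print("processor ratio:", SMALL.nodes / LARGE.nodes)
    print("disk ratio:     ", SMALL.total_disks / LARGE.total_disks)
    print("memory ratio:   ", SMALL.total_memory_mb / LARGE.total_memory_mb)
```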
The main lesson taught by the authors of AlphaSort [26] is that even large sorting problems should be performed in a single pass, since only half the amount of disk I/O is performed and the price of memory is relatively low. All previous record-holders on both the Datamation and MinuteSort benchmarks were able to sort the records in a single pass. However, as we shall see, our NOW configuration is memory-starved, so we perform the MinuteSort benchmark in two passes.
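The "half the disk I/O" claim follows from counting reads and writes. A minimal sketch, assuming every pass reads and writes the full data set once and that two passes always suffice when the data does not fit in memory; the function names and the usable-memory figure in the example are hypothetical, not values from the paper.

```python
def disk_io_bytes(data_bytes, passes):
    """Total bytes moved to and from disk: each pass reads and writes all data."""
    return 2 * passes * data_bytes


def passes_needed(data_bytes, usable_memory_bytes):
    """One pass if the data fits in usable memory, otherwise two (assumed)."""
    return 1 if data_bytes <= usable_memory_bytes else 2


if __name__ == "__main__":
    GB = 2**30
    data = 6 * GB          # hypothetical amount of data to sort
    usable_memory = 3 * GB  # hypothetical memory left over for the sort
    passes = passes_needed(data, usable_memory)
    print("passes:", passes)
    print("disk I/O in GB:", disk_io_bytes(data, passes) / GB)
    print("one-pass I/O would be:", disk_io_bytes(data, 1) / GB)
```

With these numbers the two-pass sort moves 24 GB across the disks where a one-pass sort would move 12 GB, which is why a memory-starved configuration pays a real penalty on MinuteSort.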
In addition to the usual connection to the outside world via 10 Mb/s Ethernet, every workstation contains a single Myrinet network card. Myrinet is a switch-based, high-speed, local-area network, with links capable of bi-directional transfer rates of 160 MB/s [10]. Each Myrinet switch has eight ports, and the 64-node cluster is constructed by connecting 26 of these switches in a 3-ary tree.
Control of smoothness in non-linear filtering of geomagnetic data by a one-pass method with a piecewise cubic polynomial - Itonaga - 1997 - Geophysical Journal International - Wiley Online Library
A smoother realized by a one-pass method with a piecewise cubic polynomial and a modified Powell criterion is termed a PCP filter. Since its action is data-adaptive, the PCP filter is non-linear. Furthermore, this filter is also non-causal and in practice causes little phase shift in the output and/or residual data as long as the input data has no dominant components with frequencies within its transition band. An average amplitude response of the PCP filter for sinusoidal inputs is determined numerically for each value of the parameter &, which appears in the modified Powell criterion and controls the smoothness of the filter output. An objective criterion for the selection of & in the practical application of the PCP filter is then obtained from this response. Furthermore, the cut-off frequency of the PCP filter is defined as the first frequency at which the attenuation of the average amplitude response becomes 3 dB. It is found that the cut-off frequency is a good measure of the smoothness of the filter output. This frequency decreases with increasing &, meaning that the output of the PCP filter becomes smoother and the oscillations with longer periods can be separated as residuals as & grows large. As & increases, the average number of sample points between two adjacent knots becomes larger and the computational cost of processing data with the PCP filter increases.