Overview

folder
parallel


Sheet 1: folder

F 9300tolly.pdf 3Com Corporation SuperStack II Switch 9300 Gigabit Ethernet Performance
F advanceInComputing00.pdf Jeffrey K. Hollingsworth Resource-Aware Meta-Computing
F allerton99.pdf Youngmi Joo On the impact of variability on the buffer dynamics in IP networks
F ata_torus.pdf Yu-Chee Tseng An Efficient Scheme for Complete Exchange in 2D Tori
F atlas.ps-Gsview R. Clint Whaley Automatically Tuned Linear Algebra Software
F avalon.ps-Gsview Michael Warren Avalon: An Alpha/Linux Cluster Achieves 10 Gflops for $150k
F Banerjee.pdf Subrata Banerjee Regular Multihop Logical Topologies For Lightwave Networks
F bin-ica3pp-but-cam.ps-Gsview Bin Zhou A Performance Comparison of Buffering Schemes for Multistage Switches
F bin-tencon-send.ps-Gsview Bin Zhou and M. Atiquzzaman Impact of Switch Architectures on the Performance of Multistage Interconnection Networks
F bsplib-ipdps99.pdf Stephen Donaldson Exploiting Global Structure for Performance on Clusters
F cc99-tom.ps-Gsview Thomas Warschko A Reliable Transmission Protocol for Myrinet
F ccr-9501-keshav91.pdf Srinivasan Keshav A Control-Theoretic Approach to Flow Control
F cheat.pdf Theoretical Computer Science Cheat Sheet
F chen-mun-4.ps-view Mohammed Atiquzzaman Realistic Modeling of Blocked Packets for Accurate Performance Evaluation of ATM Switches
F collapse_may99.pdf Sally Floyd and Kevin Fall Promoting the Use of End-to-End Congestion Control in the Internet
F comm.pdf Communications technology 1999 analysis & forecast
F comm_overh_cm5.pdf Ravi Ponnusamy Communication Overhead on CM5 : An Experimental Performance Evaluation
F comm_sched_cm5.pdf Ravi Ponnusamy Scheduling Regular and Irregular Communication Patterns on the CM-5
F contr.atm.ps-GSview Philip K. McKinley Design of Collective Communication Operations on ATM Networks (Position Paper)
F cplant.ps-GSview Rolf Riesen Cplant
F cr5.ps-GSview Raj Jain Congestion Avoidance in Computer Networks With a Connectionless Network Layer
F crit.ps- GSview Gabriel Loh A Critical Assessment of LogP : Towards a Realistic Model of Parallel Computation
F CutT_packSw.pdf N.M.A. AYAD and F.A.Mohamed Performance Analysis of a Cut-through vs Packet-switching Techniques
F dtjd01pf.pdf Robert Souza GIGAswitch System: A High-performance Packet-switching Platform
F dtjo01pf.pdf Zarka Cvetanovic AlphaServer 4100 Performance Characterization
F dynamic_cong_avoid.ps-Gsview Van Jacobson Dynamic Congestion Avoidance / Control (long message)
F effcomm_TE.pdf Satish Rao Efficient Communication Using Total-Exchange
F eval.ps P.H. Carns An Evaluation of Message passing Implementations on Beowulf Workstations
F fsocket.pdf Steven H.Rodrigues High-Performance Local Area Communication With Fast Sockets
F gath1_81.ps Susumu Shibusawa Scatter and Gather Operations on an Asynchronous Communication Model
F Gibbons94-contention.pdf Morgan-Kaufmann J.H. Reif, editor. A Synthesis of Parallel Algorithms
F giga_eth.pdf Howard Frazier Gigabit Ethernet : from 100 to 1000 Mbps
F giga_sw.pdf Kenneth Christensen Comparison of the Gigabit Ethernet Full-Duplex Repeater,CSMA/CD, and 1000/100-Mbps Switched Ethernet
F GIT-CC-94-28.ps Amarnath Mukherjee A Proof of Quasi-Independence Of Sliding Window Flow Control and Go-Back-N Error Recovery under Independent Packet Errors
F glunix98.ps Douglas Ghormley GLUnix: a Global Layer Unix for a Network of Workstations
F grille_approx.ps Pierre Fraigniaud Comparison of heuristics for one-to-all and all-to-all communications in partial meshes
F Gupta-inputQsw.pdf Pankaj Gupta Scheduling in Input Queued Switches: A Survey
F ibmsp2.pdf Gheith Abandah Modeling the Communication Performance of the IBM SP2
F icdcs99-harmony.pdf Peter Keleher Exposing Application Alternatives
F ICPADS97.pdf Xipeng Xiao An Overview of IP Switching and Tag Switching
F ics99-sumi.pdf Shinji Sumimoto The Design and Evaluation of High Performance Communication using a Gigabit Ethernet
F IEICE2001-faimess.pdf Go Hasegawa Survey on Fairness Issues in TCP Congestion Control Mechanisms
F inf_trans.ps Sung-Ho Choi Transient Analysis of Queueing Systems with Gradual Response and Applications
F info_theory_&_comm_net.pdf Anthony Ephremides Information Theory and Communication Networks : an Unconsummated Union
F inputoutput.TR.ps Melanie Fulgham A Comparison of Input and Output Driven Routers
F InterNet_review.pdf Howard Jay Siegel Inside Parallel Computers: Trends in Interconnection Networks
F ipp95-bala.pdf Vasanth Bala CCL: A Portable and Tunable Collective Communication Library for Scalable Parallel Computers
F jitu_sigcomm98.pdf Jitendra Padhye Modeling TCP Throughput : A Simple Model and its Empirical Validation
F Ian_eval.pdf Andrew Rindos Performance Evaluation of The Latest High Speed Lan Adapters : 100Mbps TOKEN RING; Gbps ETHERNET
F lectures_11-12.ps The Discrete Fourier Transform (DFT)
F Ii01tnpf.pdf TruCluster Software Highly Available and Scalable Solutions on Tru64 UNIX AlphaServer Systems
F m_sidi_delay_94.ps Israel Cidon On Queueing Delays of Dispersed Messages
F m_sidi-ebb-93.ps Opher Yaron Performance and Stability of Communication Networks via Robust Exponential Bounds
F m_sidi_encyclopedia.ps Moshe Sidi Single Server Queueing Models for Communication Systems
F memhie_clust.pdf Xing Du The Impact of Memory Hierarchies on Cluster Computing
F model_ccr97.pdf Matthew Mathis The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm
F mwpp99.ps Giovanni Chiola Porting MPICH ADI on GAMMA with Flow Control
F myricom.pdf Bob Felderman The Next Generation: GM,LANai7 and 64-bit PCI
F net_switch.pdf Martin Herbordt Design Trade-Offs of Low-Cost Multicomputer Network Switches
F opt_ata_torus.pdf Yu-Chee Tseng Bandwidth-Optimal Complete Exchange on Wormhole-Routed 2D/3D Torus Networks: A Diagonal-Propagation Approach
F opt_star.pdf Paraskevi Fragopoulou Optimal Communication Algorithms on the Star Interconnection Network
F opt_TE_hcube.pdf D.Delesalle Optimal Total Exchange on an SIMD Distributed- Memory Hypercube
F opt_TE-ring.pdf Vassilios Dimakopoulos Optimal Total Exchange in Linear Arrays and Rings
F ospf.pdf Open Shortest Path First (OSPF)
F p100-kasbekar.pdf Mangesh Kasbekar pSNOW: A Tool to Evaluate Architectural Issues for NOW Environment
F p1014-press.pdf Edgar H. Sibley Larry Press Benchmarks for LAN Performance Evaluation
F p109-larsen.pdf Kai R.T. Larsen A Cost and Performance Model for Web service Investment
F p10-li.pdf San-Qi Li Link Capacity Allocation and Network Control by Filtered Input Rate in High-Speed Networks
F p119-mainwaring.pdf Alan Mainwaring Design Challenges of Virtual Networks: Fast, General-Purpose Communication
F p11-gerwig.pdf Kate Gerwig Quality Of Service: Getting its priorities straight
F p123-skillicorn.pdf David Skillicorn Models and Languages for Parallel Computation
F p133_zhang.pdf Lixia Zhang Observations on the Dynamics of a Congestion Control Algorithm : The Effects of Two-Way Traffic
F p1467-ahituv.pdf Niv Ahituv A Model for Predicting and evaluating computer resource consumption
F p167-owicki.pdf Susan Owicki Factors in the performance of the AN1 Computer Network
F p172-anglano.pdf Cosimo Anglano Predicting Parallel Applications Performance on Non-dedicated Cluster Platforms
F p193-puente.pdf V. Puente Low-level Router Design and its Impact on Supercomputer System Performance
F p207-fahringer.pdf Thomas Fahringer A Static Parameter based Performance Prediction Tool for Parallel Programs
F p211-helman.pdf David Helman Parallel Algorithms for Personalized Communication and Sorting with an Experimental Study (Extended Abstract)
F p21-liao.pdf Cheng Liao Performance Monitoring in a Myrinet-Connected Shrimp Cluster
F p221-abeysundara.pdf Bandula Abeysundara High-Speed Local Area Networks and Their Performance : A Survey
F p228-buy.pdf Ugo Buy Analysis of Real-Time Programs with Simple Time Petri Nets
F p232-aron.pdf Mohit Aron Soft timers: efficient microsecond software timer support for network processing
F p251-panconesi.pdf Alessandro Panconesi Fast Randomized Algorithms for Distributed Edge Coloring ( Extended Abstract )
F p262-bjorkman.pdf Mats Bjorkman Performance Modeling of Multiprocessor Implementations of Protocols
F p263-norman.pdf Michael Norman Models of Machines and Computation for Mapping in Multicomputers
F p266-greenberg.pdf Albert Greenberg Computational Techniques for Accurate Performance Evaluation of Multirate, Multihop Communication Networks
F p271-juurlink.pdf Ben Juurlink A Quantitative Comparison of Parallel Computation Models
F p303-van_gemund.pdf Arjan J.C. van Gemund Compiling Performance Models from Parallel Programs
F p308-chen.pdf Peter M. Chen A New Approach to I/O Performance Evaluation-- Self-Scaling I/O Benchmarks, Predicted I/O Performance
F p310-hinrichs.pdf Susan Hinrichs An Architecture for Optimal All-to-All Personalized Communication
F p314-jacobson.pdf Van Jacobson Congestion Avoidance and Control
F p344-saavendra,pdf Rafael Saavedra Analysis of Benchmark Characteristics and Benchmark Performance Prediction
F p369-andrews.pdf Matthew Andrews Stability Results for Networks with Input and Output Blocking
F p474-harwood.pdf Aaron Harwood A Method of Trading Diameter for Reduced Degree to Construct Low Cost Interconnection Networks
F p488-kaushik.pdf S.D. Kaushik An Algebraic Theory for Modeling Direct Interconnection Networks
F p533-luciani.pdf James Luciani An Analytical Model for Partially Blocking Finite-Buffered Switching Networks
F p574-nicol.pdf David M. Nicol Parallel Simulation of Timed Petri-nets
F p581-panconesi.pdf Alessandro Panconesi Improved Distributed Algorithms for Coloring and Network Decomposition Problems
F p583-alon.pdf N. Alon Routing permutations on graphs via matchings (extended abstract)
F p593-choy.pdf Manhoi Choy Efficient Fault Tolerant Algorithms for Resource Allocation in Distributed Systems
F p60-ghezzi.pdf Carlo Ghezzi A General Way to put time in petri nets
F p610-grunwald.pdf Dirk Grunwald Networks for Parallel Processors: Measurements and Prognostications
F p629-grossglauser.pdf Matthias Grossglauser On the Relevance of Long-Range Dependence in Network Traffic
F p82-liu.pdf Zhen Liu Burst Reduction Properties of Rate-Control Throttles: Downstream Queue Behavior
F p832-d_ambrosio.pdf Matteo D'Ambrosio Evaluating the Limit Behavior of the ATM Traffic Within a Network
F p83-barve.pdf Rakesh Barve Modeling and optimizing I/O throughput of multiple disks on a bus
F p94-adler.pdf Micah Adler Modeling Parallel Bandwidth : Local vs.Global Restrictions
F p94-lee.pdf Corinna Lee A Study of Partitioned Vector Register Files
F pp97-jurczyk.pdf Michael Jurczyk Performance and Implementation Aspects of higher Order head-of-Line Blocking Switch Boxes
F p264-szymanski.pdf Ted Szymanski Design and analysis of buffered crossbar and banyans with cut-through switching
F pp89-kobza.pdf John Kobza A head-of-line approximation to delay-dependent scheduling in Integrated Packet-Switched networks
F pp91-hosseini.pdf S.H. Hosseini Distributed Global State Determination via Graph Coloring
F pp96-mckeown.pdf Nick McKeown Achieving 100% throughput in an Input-Queued Switch
F pp97-genong.pdf Ge Nong A performance model for ATM Switches with Multiple Input Queues
F p188-mckeown.pdf Nick McKeown The iSLIP Scheduling algorithm for input-queued switches
F pp97-sharma.pdf Neeraj Sharma Comparison of windowing policies for input buffered packet switch
F pp99-schoenen.pdf Rainer Schoenen Prioritized arbitration for input-queued switches with 100% throughput
F pdpta99.ps Giovanni Chiola GAMMA on Dec 2114x with Efficient Flow Control
F perfwpfinal.pdf Rolf McClellan Evaluating expected network performance based on multilayer switch performance data
F pp2000-sun.pdf Yuzhong Sun Recursive cube of rings: a new topology for interconnection networks
F pp2000-urban.pdf Peter Urban Contention-aware metrics for distributed algorithms: comparison of atomic broadcast algorithms
F pp2000-yang.pdf Yuanyuan Yang Optimal all-to-all personalized exchange in self-routable multistage networks
F pp88-cidon.pdf Israel Cidon Real-time packet switching : a performance analysis
F pp88-hluchyi.pdf Michael Hluchyj Queueing in high-performance packet switching
F pp88-leung.pdf C.H.C.Leung ,Y.Kikumoto The throughput efficiency of the Go-Back-N ARQ scheme under Markov and related error structures
F pp89-choi.pdf Jun-Kyun Choi On acknowledgment schemes of sliding window flow control
F pp89-fendick.pdk Kerry W. Fendick Dependence in packet queues
F pp89-hosseini.pdf S.H. Hosseini System theory modeling and performance analysis of a distributed load balancing algorithm
F pp90-monterio.pdf Jose A. Suruagy Monteiro Leaky bucket analysis for ATM networks
F pp91-Agarwal.pdf Anant Agarwal Limits on Interconnection network performance
F pp91-cruz1.pdf Rene L. Cruz A calculus for network delay, Part 1: network elements in isolation
F pp91-cruz2.pdf Rene L. Cruz A calculus for network delay, Part 2: network analysis
F pp91-ko.pdf Keng-Tai Ko Optimal end-to-end sliding window flow control in high-speed networks
F pp91-scott.pdf David S.Scott Efficient all-to-all communication patterns in hypercube and mesh topologies
F pp91-zhang.pdf Xiaodong Zhang Performance prediction and evaluation of parallel processing on a NUMA Multiprocessor
F pp92-kuang.pdf Lei Kuang Monotonicity properties of the leaky bucket
F pp93a-cidon.pdf Israel Cidon Analysis of a Correlated queue in a communication system
F pp93-agrawala.pdf Ashok K Agrawala Deterministic Model and transient analysis of virtual circuits
F pp93b-cidon.pdf Israel Cidon On queues with inter-arrival times proportional to service times
F pp93-bianchi.pdf Giuseppe Bianchi Improved queueing analysis of shared buffer switching networks
F pp93c-cidon.pdf Israel Cidon Analysis of message delay processes
F pp-93-lee.pdf J.Y.Lee , C.K. Un performance of dynamic rate leaky bucket algorithm
F pp93-li.pdf San-Qi Li Fundamental limits of input rate control in high speed network
F pp93-lin.pdf Xiaola Lin Multicast communication in multicomputer networks
F pp93-naghshineh.pdf Mahmoud Naghshineh fixed versus variable packet sizes in fast packet-switched networks
F pp93-wong.pdf Michael K.Wong A deterministic fluid model for cell loss in ATM networks
F pp93-yaron.pdf Opher Yaron Performance and Stability of Communication Networks via Robust Exponential Bounds
F pp93-yoshimoto.pdf Masakazu Yoshimoto Waiting time and queue length distributions for go-back-N and selective-repeat ARQ protocols
F pp94-agarwal.pdf R.C.Agarwal A high performance parallel Algorithm for 1-D FFT
F pp94-hambrusch.pdf Susanne E. Hambrusch C3: An architecture-independent model for coarse-grained parallel machines
F pp94-heddayo.pdf Abdelsalam Heddaya Using warp to control network contention in Mermera
F pp94-mckinley.pdf Philip K. McKinley Unicast-Based Multicast communication in wormhole-routed networks
F pp94-Notorogan.pdf Chitra Natarajan Measurement-based characterization of global memory and network contention, operating system and parallelization overheads: case study on a shared-memory multiprocessor
F pp94-strumpen.pdf Volker Strumpen Exploiting communication latency hiding for parallel network computing : model and analysis
F pp94-valerio.pdf M. Valerio Recursively scalable fat-trees as interconnection networks
F pp94-zu.pdf Hong Xu Optimal software multicast in wormhole-routed multicast networks
F pp95_qiang.pdf Qiang Li Fat-tree for local area multiprocessors
F pp95-bader.ps David A. Bader Practical parallel algorithms for personalized communication and integer sorting
F pp95-dimpsey.pdf Robert T. Dimpsey A measurement-based model to predict the performance impact of system modifications: A Case Study
F pp95-durger.pdf Douglas C. Burger Accuracy vs. Performance in parallel simulation of interconnection networks
F pp95-eicken.pdf Thorsten Von Eicken Low-latency communication over ATM networks using active messages
F pp95-lowe.pdf lowe data communication
F pp95-ohring.pdf Sabine R Ohring On generalized fat trees
F pp95-yao.pdf Yu-Dong Yao An effective go-back-N ARQ scheme for variable-error-rate channels
F pp96-chen.pdf Biao Chen Meeting delay requirements in computer networks with wormhole routing
F pp96-culler.pdf David E. Culler Assessing fast network interfaces
F pp96-leonardi.pdf Emilio Leonardi congestion control in asynchronous, high-speed wormhole routing networks
F pp96-molle-100BaseT,pdf Mart Molle 100Base-T / IEEE 802.12/Packet Switching
F pp96-ni.pdf Lionel M. Ni issues in designing truly scalable interconnection networks
F pp96-park.pdf Ju-Young L. Park construction of optimal multicast trees based on the parameterized communication model
F pp96-sun.pdf Xian-He Sun Performance prediction A case study using a scalable shared-virtual-memory machine
F pp96-tseng.pdf Yu-Chee Tseng all-to-all personalized communication in a wormhole-routed torus
F pp96-zheng.pdf S.Q.Zheng Dual of a complete graph as an interconnection network
F pp97-bokhari.pdf Shahid H. Bokhari performance evaluation Balancing contention and synchronization on the Intel Paragon
F pp97-donglai.pdf Donglai Dai How much does network contention affect distributed shared memory performance ?
F pp97-galles.pdf Mike Galles Spider : A high-speed network interconnect
F pp97-hall.pdf James Hall Counting the cycles : a comparative study of NFS performance over high speed networks
F pp97-jain.pdf Ravi Jain Heuristics for scheduling I/O Operations
F pp97-joshi.pdf Bharat S. Joshi On a Load Balancing algorithm based on edge coloring
F pp97-laforest.pdf Christian Laforest Edge disjoint graph spanners of complete graphs and complete digraphs extended abstract
F pp97-lakshman.pdf T.V. Lakshman The performance of TCP/IP for networks with high bandwidth-delay products and random loss
F pp97-nupairoj.pdf Natawut Nupairoj Architecture-Dependent tuning of the parameterized communication model for optimal multicasting
F pp97-petrini.pdf Fabrizio Petrini Network performance under physical constraints
F pp98-chien.pdf Andrew A. Chien Design challenges for high-performance network interfaces
F pp98-guo.pdf Minyi Guo Contention-free communication scheduling for array redistribution
F pp98-kirovski.pdf Darko Kirovski Efficient coloring of a large spectrum of graphs
F pp98-kumar.pdf Girish Kumar efficient algorithms for delay-bounded minimum cost path problem in communication networks
F pp98-markaki.pdf M. Markaki An adaptive genetic algorithm for channel sharing in high speed network
F pp98-mishra.pdf Shivakant Mishra An evaluation of flow control in group communication
F pp98-sethu.pdf Harish Sethu IBM RS/6000 SP interconnection network topologies for large systems
F pp99-feuser.pdf Oliver Feuser On the effects of the IEEE 802.3x flow control in full-duplex ethernet LANs
F pp99-fineberg.pdf Samuel Fineberg Analysis of 100Mb/s Ethernet for the Whitney Commodity computing testbed
F pp99-floyd.pdf Sally Floyd Promoting the Use of End-to-End Congestion Control in the Internet
F pp99-fujita.pdf Satoshi Fujita A fault-tolerant broadcast scheme in the star graph under the single-port, half-duplex
F pp99-pandey.pdf Aparna Pandey Quality of service support over switched ethernet
F pp99-qiu.pdf Lili Qiu On individual and aggregate TCP performance
F pp99-riezenman.pdf Communication technology 1999 analysis & forecast
F pp99-sharma.pdf Supriya Sharma optimal buffer management policies for shared-buffer ATM switches
F ppam97prof.ps.gz Jacek Kitowski Review of parallel computer architectures
F queueing_systems.pdf Santosh Venkatesh Delay models in the network layer
F QueuingAnalysis.pdf William Stallings Queuing Analysis
F R.pdf Randolph Wang Towards a theory of optimal communication pipelines
F route.ps Yoav Etsion Topology and routing in clusters : from theory to practice
F router_compare.pdf J. Duato A comparison of router architectures for virtual cut-through and wormhole switching in a NOW environment
F saleh-hot-1.pdf M. Saleh An accurate performance model of shared buffer ATM switches under hot spot traffic
F sc98USC.ps Soichiro Araki User-Space communication : A Quantitative Study
F sccs-0710.pdf Gang Cheng The high performance switch and programming interfaces on IBM SP2(draft)
F sci2000.pdf M. Aboelaze Performance of a Switched Ethernet: A Case Study
F sicon95.ps Emmanouel Varvarigos Loss-free communication in High-Speed Networks
F sigcomm2000-8-3.ps Kevin Lai Measuring Link Bandwidths using a deterministic model of packet delay
F smart.ps S. Keshav Smart retransmission : performance with overload and random losses
F sync_iPSC.pdf Steven Seidel A global synchronization algorithm for the intel iPSC/860
F tcp-evaluation.pdf Mark Allman On the effective Evaluation of TCP
F TE_theory.pdf Vassilios Dimakopoulos A theory for total exchange in multidimensional interconnection networks
F traff.ps David Jagerman Stochastic modeling of traffic processes
F transient.ps Bong Dae Choi Transient analysis of a Markovian Arrival queue with congestion control based on Thresholds
F Tron94-model-contention.pdf Cecile Tron Modelling of communication contention in Multiprocessors
F van-sally.pdf Sally Floyd Traffic Phase effects in packet-switched gateways
F Yate93_Per-session.ps David Yates On per-session end-to-end delay distributions and the call admission problem for real-time applications with QOS requirements
F zaim-final.ps Abdelsalam Heddaya Congestion control for asynchronous parallel computing on workstation networks
F pp95-elsaadany.pdf Amr Elsaadany Performance study of buffering within switches in local area networks
F pp95-shi.pdf Hong Shi Buffer size trade-offs in input/output-buffered atm switches under various conditions
F pp95-wang.pdf Jonathan L. Wang Impact of self-similarity on the Go-Back-N ARQ protocol
F infocom.pdf Alhussein A. Abouzeid Stochastic modeling of TCP over lossy links
F 04a_02.pdf Xiaowei Yang A model for window based flow control in packet-switched networks
F Compendium.pdf Regress+ A Compendium of Common Probability Distributions
F Yajn99_Meas.ps.Z Maya Yajnik Measurement and modelling of the temporal dependence in packet loss
F pp91-lee.pdf T.H. Lee The throughput efficiency of the Go-Back-N ARQ scheme for burst-error channel
F pp97-zepernick.pdf H.J. Zepernick Reliability and throughput analysis of ARQ schemes in burst error channels
F sc00-feng.pdf W. Feng The Failure of TCP in High-Performance Computational Grids
F sc00-hsieh.pdf Jenwei Hsieh Architectural and Performance Evaluation of GigaNet and Myrinet Interconnects on Clusters of Small-Scale SMP Servers
F hotinter95-tcp.ps Kimberly K. Keeton LogP quantified: the case for low-overhead local area networks
F pp89-clark.pdf David Clark An analysis of TCP processing overhead
F Via_Java_sock.ps Avneesh Pant An efficient implementation of Java stream sockets on VIA
F hpdc-vmi.pdf Avneesh Pant VMI: An efficient messaging library for heterogeneous cluster communication
F pad1.pdf Volnys Borges PAD cluster: An open, modular and low cost high performance computing system
F infocomm-98.pdf Anindya Basu Promela++: A language for constructing correct and efficient protocols
F 313.ps.gz J. Bolliger The effectiveness of end-to-end congestion control mechanisms
F final-paper.pdf Mark Baker Cluster computing white paper
F report339.pdf Christian Kurmann Improving the Network Interfaces for Gigabit Ethernet in Clusters of PCs by Protocol Speculation
F tcp_802_3x.ps.gz J. Wechta The Interaction of the TCP Flow Control Procedure in End Nodes on the Proposed Flow Control Mechanism for Use in IEEE 802.3 Switches
F selective-back-pressure-in.pdf W. Noureddine Selective Back-Pressure In Switched Ethernet Lans
F durham98.zip J. Wechta Simulation-based Analysis of the Interaction of End-to-End and Hop-by-Hop Flow Control Schemes in Packet Switching LANs
F man.zip J. Wechta An Investigation into the Performance of Switched LANs
F hoti2000final.pdf Werner Vogels Tree-saturation control in the AC3 velocity cluster interconnect
F 802_3x_vendor.ps Network World Flow control feedback from various vendors
F 121.PDF V. Shurbanov A queueing model for space-division packets switches and its application to the performance evaluation of computer networks
F 152.PDF Franck Cappello HiHCoHP - Towards a realistic communication model for hierarchical hyperclusters of heterogeneous processors
F 159.PDF Cristina Boeres On the design of clustering-based scheduling algorithms for realistic machine models
F 164.PDF Chamath Keppitiyagama Asynchronous MPI messaging on Myrinet
F 171.PDF Yuanyuan Yang Near-Optimal All-to-all broadcast in multidimensional all-port meshes and tori
F 176.PDF Yoav Etsion User-level communication in a system with gang scheduling
F CAC_04.PDF G. Ciaccio Messaging on Gigabit Ethernet: Some experiments with GAMMA and other systems
F CAC_07.PDF Xavier Molero On the interconnection topology for storage area networks
F CAC-11.PDF Nan Ni Fair scheduling for input buffered switches
F CAC-12.PDF Patricia Gilfeather Fragmentation and high performance IP
F pp99-logpQ.pdf Takayoshi Touyama Performance Evaluation of Practical Parallel Computation Model LogPQ
F pp99-QSM.pdf Brian Grayson Experimental Evaluation of QSM, a Simple Shared-Memory Model
F Chochia96_model.pdf G. Chochia Analysis of Multicomputer Schedules in Cost and Latency Model of Communications
F Wu97-perf_model.pdf Xingfu Wu Performance models for scalable cluster computing
F Hipper97-cluster.pdf G. Hipper Advanced workstation cluster architectures for parallel computing
F com.ps.gz Amnon Barak Performance of the Communication Layers of TCP/IP with Myrinet Gigabit LAN
F prylli.pdf Loic Prylli BIP: a new protocol designed for high performance networking on Myrinet
F PE01-Federica.pdf Federica Aquilani Performance analysis at the software architectural design level
F PE01-Ajmone_Marsan.pdf M. Ajmone Marsan Accurate approximate analysis of cell-based switch architectures
F bsp_logp.ps Gianfranco Bilardi BSP vs LogP
F qsmemul.ps Vijaya Ramachandran Emulations between QSM, BSP and LogP: A framework for general-purpose parallel algorithm design
F p103-valiant.pdf L. Valiant A Bridging Model for Parallel Computation
F pp96-Xu.pdf Z. Xu MPPs and Clusters for Scalable Computing
F Ni97-sw.ps.gz Lionel M. Ni Switches and Switch interconnects
F beowulf_paper.pdf R.J. Allan High Performance Computing and Beowulf Clusters
F asplos2000.pdf Raoul Bhoedjang Evaluating Design Alternatives for Reliable Communication on High-Speed Networks
F pp95-Moldeklev.pdf K. Moldeklev How a large ATM MTU causes deadlocks in TCP data transfers

Sheet 2: parallel

P 164pcds.pdf

P 1997.HPCA97.user_level_dma.ps Evangelos p.markatos user-level DMA without operating system kernel modification
P 21164ds.pdf

P 21164pb.pdf

P 21264pb.pdf
P 70100079.ps umesh maheshwari collecting cyclic distributed garbage by controlled migration
P 93_tr479_the_search_for_lost_cycles.ps mark e. crovella the search for lost cycles: a new approach to parallel program performance evaluation
P 93-547.ps rafael h.saavedra characterizing the performance space of shared memory computers using micro-benchmarks
P 93-spaa.ps richard m.karp optimal broadcast and summation in the LogP Model
P 95-61.pdf shahid h.bokhari multiphase complete exchange on paragon,SP2 & CS-2
P 96_17.ps J.M. Nash scalable and portable computing using the WPRAM Model
P 96001.pdf j simon on accurate performance prediction for massively parallel systems and its applications
P 96005.pdf jens simon performance prediction of benchmark programs for massively parallel architectures
P 96-73.ps david cronk thread migration in the presence of pointers
P 96-SCIZZL.PS maximilian ibel implementing active messages and split-C for SCI clusters and some architectural implications
P 98003.pdf j.simon the latency-of-data-access model for analysing parallel computation
P ace-intro.ps douglas c.schmidt the adaptive communication environment an object-oriented network programming toolkit for developing communication software
P ace-ipcl.ps douglas c.schmidt IPC SAP a family of object-oriented interfaces for local and remote interprocess communication
P ace-jaws.ps james c.hu JAWS: a framework for high-performance web servers
P aeneas.ps herbert w.hamber AENEAS a custom-built parallel supercomputer for quantum gravity
P alcover_PDP96 r.alcover interconnection network design : a statistical analysis of interactions between factors
P alltoall_flood.pdf

P alltoall_irregular.pdf wenheng liu portable and scalable algorithms for irregular all-to-all communication
P alltoall_kport.pdf ming-syan chen optimal all-to-all broadcasting schemes in distributed systems
P alltoall_mesh_paragon.pdf shahid h.bokhari balancing contention and synchronization on the intel paragon
P alltoall_multi_phase.pdf shahid h.bokhari multiphase complete exchange on paragon,SP2 & CS-2
P alltoall_now.pdf matt jacunski all-to-all broadcast on switch-based clusters of workstations
P alltoall_review.pdf ming-syan chen on general results for all-to-all broadcast
P am-spec-2_0.ps alan mainwaring active message applications programming interface and communication subsystem organization
P analysis.ps peter druschel network subsystem design: a case for an integrated data path
P arcs.ps giovanni chiola architectural issues and preliminary benchmarking of a low-cost network of workstations based on active message
P Avalanche Message Passing.doc

P bcast-async.ps amotz bar-noy designing broadcasting algorithms in the postal model for message-passing systems
P bcast-kport.ps amotz bar-noy broadcasting multiple messages in the multiport model
P BDM.ps david a. bader practical parallel algorithms for dynamic data redistribution, median finding, and selection (preliminary draft)
P bench_faq.txt

P bench_pro8.pdf

P benchm_muticast_comm.ps natawut nupairoj benchmarking of multicast communication services
P benchmark.ps stephen j.von worley microbenchmarking and performance prediction for parallel computers
P benchmarkxx.ps brian n.bershad using microbenchmarks to evaluate system performance
P bilas-sc97.ps angelos bilas the effects of communication parameters on end performance of shared virtual memory clusters
P bip-manual.ps loic prylli BIP messages user manual for BIP 0.94
P bml.ps rafael h.saavedra analysis of benchmark characteristics and benchmark performance prediction
P boden_micro95.pdf nanette j. boden myrinet: a gigabit-per-second local area network
P the bridging_ model_ gap.html bruce m.martin the bridging model gap: what are bridging models missing?
P cachekernel_ps david r.cheriton a caching model of operating system kernel functionality
P cacm.ps peter druschel operating system support for high-speed communication
P CAMPaS1.ps y. tanaka COMPaS: a pentium pro PC-based SMP cluster and its experience
P cappello97.ps peter cappello Javelin : internet-based parallel computing using java
P ccc97.ps j.m. graham models,paradigms and parallel languages : what else do we need ?
P cc-exp_ps massimo bernaschi collective communication operations: experimental results vs.theory
P challenge_paper.ps mike galles performance optimizations,implementation, and verification of the SGI challenge multiprocessor
P cheating.ps durrell anderson cheating the I/O bottleneck: network storage with trapeze/myrinet
P chiola.ps giovanni chiola GAMMA on dec 2114x with efficient flow control
P choi.ps sung-eun choi quantifying the effects of communication optimizations
P cierniak97.ps michal cierniak just-in-time optimizations for high-performance java programs
P ckpt97.ps michael litzkow checkpoint and migration of UNIX processes in the condor distributed processing system
P clumps.ps steven s.lumetta multi-protocol active messages on a cluster of SMP's(to appear in the proceedings of SC97)
P cluster.htm

P cluster.ps stephen donaldson BSP clusters : high performance,reliable and very low cost
P CMU-1.ps jose carlos brustoloni user-level protocol servers with kernel-level performance
P CMU-2.ps jose carlos brustoloni scaling of end-to-end latency with network transmission rate
P CMU-3.ps jose carlos brustoloni evaluation of data passing and scheduling avoidance
P comp9812-chien.pdf andrew a.chien design challenges for high-performance network interfaces
P comp9812-lee.pdf whay sing lee an efficient,protected message interface
P comp9812-user.pdf raoul a.f. bhoedjang user-level network interface protocols
P comp9812-VIA.pdf thorsten von eicken evolution of the virtual interface architecture
P COMPaS-report.ps y. tanaka COMPaS: a pentium pro PC-based SMP cluster and its experience
P complete_exchange.pdf shahid h.bokhari multiphase complete exchange : a theoretical analysis
P concepts2.ps

P cong_cont.pdf moshe sidi congestion control through input rate regulation
P const_multicast_tree_comm_model.ps ju young l. park construction of optimal multicast trees based on the parameterized communication model
P cost_model_comm_SMP.ps nancy M. amato a cost model for communication on a symmetric multiprocessor
P cpceng.zip

P dbs_paper.ps yukio murayama DBS: a powerful tool for TCP performance evaluations
P dcomspec.txt

P dcs-tr-362.ps.gz b r badrinath gathercast: an efficient multi-point to point aggregation mechanism in IP networks
P Dculler.zip

P disco-tocs.ps edouard bugnion disco: running commodity operating systems on scalable multiprocessors
P DISI-TR-96-12.ps g chiola operating system support for fast communications in a network of workstations
P dist_obj_with_CORBA.ps steve vinoski distributed object computing with CORBA
P donaldhillskill_varscal.ps stephen r donaldson communication performance optimisation requires minimising variance
P dp_paper.ps chun ming lee directed point : an efficient communication subsystem for cluster computing
P dsm.ps chris holt the effects of latency,occupancy,and bandwidth in distributed shared memory multiprocessors
P dugki_pps94.pdf dugki min a multipath contention model for analyzing job interactions in 2-D mesh multicomputers
P dxbsp-spaa95.ps guy e blelloch accounting for memory bank contention and delay in high-bandwidth multiprocessors
P E10000.ps alan charlesworth gigaplane-XB: extending the ultra enterprise family
P eisen_APDC97.pdf jorn eisenbiegler on the optimization by redundancy using an extended LogP Model
P Elsaad_CC96.pdf amr elsaadany performance evaluation of switching in local area networks
P europar.pdf matt welsh low-latency communication over fast ethernet
P exokernel.ps dawson r engler exokernel : an operating system architecture for application-level resource management
P fast_collective_comm_lib.ps prasenjit mitra fast collective communication libraries,please
P fci.ps franck cappello performance evaluation of two programming models for a cluster of PC biprocessors
P firmw_reli_comm_SAN.psa angelos bilas firmware support for reliable communication and dynamic system configuration in system area networks
P FIT-TR-97-07.ps ashley beitz a migration-friendly tasking environment for gardens
P fm.pdf scott pakin fast messages : efficient,portable communication for workstation clusters and MPPs
P FM_sc97bof.ps

P FM-II_spec.doc

P fm-pdt.ps scott pakin fast messages (FM): efficient, portable communication for workstation clusters and massively-parallel processors
P focs94.ps robert d blumofe scheduling multithreaded computations by work stealing
P foolBM.ps david h. bailey twelve ways to fool the masses when giving performance results on parallel computers
P frontiers96.ps koen langendoen integrating polling,interrupts,and thread management
P gathering.ps sandeep n. bhatt scattering and gathering messages in networks of processors
P gdcast.ps amotz bar noy multicasting in heterogeneous networks
P geist96.ps g a geist PVM and MPI : a comparison of features
P gigabit_lan.pdf david g cunningham IEEE802.12 GIGABIT LAN
P global_mem_manage.ps michael j feeley implementing global memory management in a workstation cluster
P gms_96asplos.ps herve a jamrozik reducing network latency using subpages in a global memory environment
P Golin_TCS97.pdf mordecai golin optimal point-to-point broadcast algorithms via lopsided trees
P gossip-kport.ps a bar noy computing global combine operations in the multi-port postal model
P Harz_Sevcik_SC93.ps karim harzallah hot spot analysis in large scale shared memory multiprocessors
P HillCrumptonBurgess_europar96.ps jonathan m d hill theory,practice,and a tool for BSP performance prediction
P hinet95.ps hong xu improving PVM performance using ATOMIC user-level protocol
P hori-ccc97.ps atsushi hori an implementation of parallel operating system for clustered commodity computers
P hoti4_submitted.pdf richard gillett experience using the first-generation memory channel for PCI network
P hoti97.ps brent n chun virtual network transport protocols for myrinet
P hotos95.ps allen b montz scout: a communications-oriented operating system
P hp_8way.pdf
eight-way multiprocessing
P hpc_bsp.html
high performance computing archive bulk synchronous parallel model(BSP)subject area
P hpca97.pdf matt welsh ATM and fast ethernet network interfaces for user-level communication
P HPCA97.ps y tanaka a comparison of data-parallel collective communication performance and its application
P hpca98.ps remzi h. arpaci-dusseau the architectural costs of streaming I/O: a comparison of workstations,clusters,and SMPs
P hpca98_impact.ps shubhendu s. mukherjee the impact of data transfer and buffering alternatives on network interface design
P hpca98_nitrans.ps ioannis schoinas address translation mechanisms in network interfaces
P hpdc7-lauria.ps mario lauria efficient layering for high speed communication : fast messages 2.X
P hpdc97_final.ps silvia m. figueira predicting slowdown for networked workstations
P HPVM4.ps mario lauria experiences on porting MPICH on FM and Myrinet
P hpvm-siam97.ps andrew chien high performance virtual machines (HPVM): clusters with supercomputing APIs and performance
P Ianne_PDS97.pdf giulio iannello efficient algorithms for the reduce-scatter operation in LogGP
P ibel.ps maximilian ibel high-performance cluster computing using SCI
P ibm_aix.pdf
IBM AIX version 4
P icnp98.ps.gz

P icpp95-collective.ps dhabaleswar k. panda issues in designing efficient and practical algorithms for collective communication on wormhole-routed systems
P ics98.ps francis o'carroll the design and implementation of zero copy MPI using commodity hardware with a high performance network
P ilpapplic_ps.gz bengt ahlgren the applicability of integrated layer processing
P ilpmodel.ps bengt ahlgren a performance model for integrated layer processing
P input_buff.pdf andreas kirstadter fairness and performance limits of contention resolution mechanisms for input buffered switches
P ipps.ps ad pimentel an architecture workbench for multicomputers
P IPPS97.ps maged m. michael relative performance of preemption-safe locking and non-blocking synchronization on multiprogrammed shared memory multiprocessors
P ipps97ULC.ps stefanos n. damianakis reducing waiting costs in user-level communication
P ISCA23.ps olivier maquelin polling watchdog: combining polling and interrupts for efficient message handling
P isca92.ps thorsten von eicken active messages : a mechanism for integrated communication and computation
P isca95-modelling-memory-performance.ps t. stricker optimizing memory system performance for communication in parallel computers
P isca97.ps richard p. martin effects of communication latency,overhead,and bandwidth in a cluster architecture
P iwcc99model.ps anthony tam realistic communication model for parallel computing on cluster
P jaime.ps jaime bae kim analysis of a finite buffer queue with heterogeneous markov modulated arrival processes : a study of traffic burstiness and priority packet discarding
P JaJa_PDS96.pdf joseph f jaja the block distributed memory model
P JaJja_PPS94.pdf joseph f jaja the block distributed memory model for shared memory multiprocessors (extended abstract)
P Japan.ps daniel a reed performance analysis of parallel systems : approaches and open problems
P jb.ps zheng wang analysis of burstiness and jitter in real-time communications
P jpdc97.ps chi chung lam optimal algorithms for all-to-all personalized communication on rings and two dimensional tori
P Klein_LT96.pdf leonard kleinrock the supercomputer supernet testbed : a WDM-based supercomputer interconnect
P Lahch_GLOBECOM96.pdf abdelhakim lahchime ATM switch architecture modelling under uniform and bursty traffic
P lam-mpi.ps greg burns LAM: an open cluster environment for MPI
P LANai_prog.ps anthony skjellum a guide to writing myrinet control programs for LANai 3.x
P LANai4_X_doc.txt

P latbdw.ps patrick h. worley a study of application sensitivity to variation in message passing latency and bandwidth
P lcpc96.ps pedro diniz lock coarsening : eliminating lock overhead in automatically parallelized object-based programs
P LIMBO.ps
the limbo programming language
P lmbench-usenix.ps larry mcvoy lmbench : portable tools for performance analysis
P LNPC.zip

P locking_US-letter.ps mats bjorkman locking effects in multiprocessor implementations of protocols
P logp.ps david culler LogP : towards a realistic model of parallel computation
P Logp_Micro.ps david culler LogP performance assessment of Fast network interfaces
P logpc.ps csaba andras moritz LoGPC: modeling network contention in message-passing programs
P lopc_97ppopp.pdf matthew i. frank LoPC: modeling contention in parallel algorithms
P Lopez_PDP99.pdf p. lopez optimizing network throughput : optimal versus robust design
P matching.ps dimitrios stiliadis providing bandwidth guarantees in an input-buffered crossbar switch
P memo-387.ps boon s. ang message passing support on star T-voyager
P microbench.pdf stephen j von worley microbenchmarking and performance prediction for parallel computers
P misleadBM.ps david h. bailey misleading performance reporting in the supercomputing field
P mmbmSPEC.ps allen b downey a model for speedup of parallel programs
P model-arch.ps mark j. clement architectural scaling and analytical performance prediction
P modeling_comm_in_par_alg.ps jaswinder pal singh modeling communication in parallel algorithms : a fruitful interaction between theory and systems
P model-PVM.ps.gz mark j clement network performance modeling for PVM clusters
P model-PVM2.ps.gz michael r. steed performance prediction of PVM programs
P model_of_parallelism.ps Todd Heywood Models of Parallelism
P model-scaling_ps.gz mark j clement using analytical performance prediction for architectural scaling
P mpcas_ps.gz dmitry arapov a parallel language for modular distributed programming
P mpchc_ps.gz dmitry arapov a programming environment for heterogeneous distributed memory machines
P memo-387.ps dmitry arapov a programming environment for heterogeneous distributed memory machines
P mpi_guide.ps peter s pacheco a user's guide to MPI
P mpi_t3d.ps kenneth cameron CRI/EPCC MPI for CRAY T3D
P mpi-ap1000.ps david sitsky implementation and performance of the MPI message passing interface on the fujitsu AP1000 multicomputer
P mpicharticle.ps william gropp a high-performance,portable implementation of the MPI message passing interface standard
P mpidc95_pap.ps vasilios georgitsis performance of MPL and MPICH on the SP2 system
P mpi-pcw94.ps david sitsky an efficient implementation of the message passing interface (MPI) on the fujitsu AP1000
P mppm98.ps jonathan m d hill portability of performance with the BSPLib communications library
P msu-cps-acs-106.ps sherry q moore the effects of network contention on processor allocation strategies
P myrinet-fm-sc95.ps scott pakin high performance messaging on workstations : illinois fast messages (FM) for myrinet
P NAHU94_PERFORMANCE.ps erich m nahum performance issues in parallelized network protocols
P NAHU97_CACHE.ps erich m nahum cache behavior of network protocols
P NAS-97-005.PDF andrew sohn communication studies of DMP and SMP machines
P NAS-97-017.PDF kevin t pedretti analysis of 2D torus and hub topologies of 100mb/s ethernet for the whitney commodity computing testbed
P NAS-97-023.PDF jeffrey c becker predicting cost/performance trade-offs for whitney: a commodity computing cluster
P NAS-97-024.PDF samuel a fineberg a scalable software architecture for booting and configuring nodes in the whitney commodity computing testbed
P NAS-97-025.ps samuel a fineberg analysis of 100mb/s ethernet for the whitney commodity computing testbed
P NAS-98-003.pdf jerry c yan performance data gathering and representation from fixed-size statistical data
P NAS-98-012.pdf abdul waheed performance modeling and measurement of parallelized code for distributed shared memory multiprocessors
P netperf.ps chris maeda networking performance for microkernels
P NexusJava-vg.ps george thiruvathukal / ian foster java interfaces to high performance communication systems
P nfmdcs.ps patrick g sobalvarro dynamic coscheduling on workstation clusters
P NI_support_SVM_cluster.ps angelos bilas network interface support for shared virtual memory on clusters
P non-blocking-osdi.ps michael greenwald the synergy between non-blocking synchronization and operating system structure
P numa-os.ps john chapin memory system performance of UNIX on CC-NUMA multiprocessors
P oam.ps deborah a wallach optimistic active messages: a mechanism for scheduling communication with computation
P optimal_bs_DistMem.ps ramesh subramonian optimal broadcast in a distributed memory model of parallel computation
P optimal_bs_logp.ps richard m karp optimal broadcast and summation in the LogP Model
P origin200-1.ps harvey wasserman performance evaluation of the SGI origin2000: a memory-centric characterization of LANL ASCI applications
P origin200-MenMod.ps olaf m lubeck development and validation of a hierarchical memory model incorporating CPU- and memory-operation overlap
P origin200-slides.ps federico bassetti performance evaluation of the SGI origin2000: a memory-centric characterization of LANL ASCI applications or single node performance : where? Oh where, has it gone ?
P osiris.ps peter druschel experiences with a high-speed network adaptor : a software perspective
P os-memorysys.ps j bradley chen the impact of operating system structure on memory system performance
P OSSurvey.ps anand r tripathi trends in multiprocessor and distributed operating system designs
P p2014.pdf steve chapin multiprocessor operating systems : harnessing the power
P p2016.pdf jorg cordsen vote for peace : implementation and performance of a parallel operating system
P p2028.pdf koen langendoen models for asynchronous message handling
P p259-kay.pdf jonathan kay the importance of non-data touching processing overheads in TCP/IP
P p298-bruck.pdf jehoshua bruck efficient algorithms for all-to-all communications in multi-port message-passing systems
P p31.ps db skillicorn m danelutto optimising data-parallel programs using the BSP cost model
P p584-benveniste.pdf caroline benveniste parallel simulation of the IBM SP2 interconnection network
P p7.ps wf McColl the BSP approach to architecture independent parallel programming
P p701.ps barry f smith an interface for efficient vector scatters and gathers on parallel machines
P pack_loss.pdf israel cidon analysis of packet loss processes in high-speed networks
P pack_sw.pdf ag waters fast packet switching : an overview
P paper_ubench_sc97.ps cristina hristea measuring memory hierarchy performance of cache-coherent multiprocessors using micro benchmarks
P PARKBENC.ps roger hockney public international benchmarks for parallel computers
P pdp97.ps giovanni chiola Gamma: a low-cost network of workstations based on active messages
P camera.ps y tanaka performance improvement by overlapping computation and communication on SMP clusters
P PE_pro_farm.pdf alan s wagner performance models for the processor farm paradigm
P PE_pro_wormN.pdf lionel m ni performance evaluation of switch-based wormhole networks
P pedroso97.ps hernani pedroso web-based metacomputing with JET
P perf_eval_mpi_clust.ps natawut nupairoj performance evaluation of some MPI implementations on workstation clusters
P perf_eval_sw_wormh.pdf lionel m ni performance evaluation of switch-based wormhole networks
P perfmodel.ps jurgen brehm performance modeling for SPMD message-passing programs
P philippsen_97.ps michael philippsen javaparty-transparent remote objects in java
P pipeline.ps randolph y wang modeling communication pipeline latency
P planet.pdf inder gopal network transparency:the plaNET approach
P pnode_cps.ps g iannello performance analysis of distributed memory computers with parallel node architecture
P proc_mig_iss.ps alberto zubiri design issues on process migration
P profile_myrinet.ps ilia gilderman profiling the communication layers performance of the myrinet gigabit LAN
P prop.ps rolf riesen using kernel extensions to decrease the latency of user-level communication primitives
P PUPA.ps manish verma pupa: a low-latency communication system for fast ethernet
P pupa_draft.ps manish verma a low latency communication subsystem (in preparation)
P questions&answersBSP.ps db skillicorn m danelutto questions and answers about BSP
P queueing_theory.ps georgios y lazarou continuous-time markov chains and queueing theory
P random_delay.ps vikram s adve the influence of random delays on parallel execution times
P REAL.ps eric grosse real inferno
P revguide.pdf
solaris 2.6 reviewer's guide
P Roda_PACT98.pdf roda j rodriguez c breaking the barriers : two models for MPI programming
P Roda_PDP99.pdf roda jl sande f the collective computing model
P RPMark95.pdf
RPMark tm 95
P SALE95_PERFORMANCE.ps james d salehi the performance impact of scheduling for cache affinity in parallel network processing
P sale96_Effectiveness-TON.ps james d salehi the effectiveness of affinity-based scheduling in multiprocessor networking ( extended version )
P SC94-paper.ps mike barnett building a high-performance collective communication library
P sc97.ps steven s.lumetta multi-protocol active messages on a cluster of SMP's(to appear in the proceedings of SC97)
P sc97ninf.ps atsuko takefusa multi-client LAN/WAN performance analysis of ninf : a high-performance global computing system
P sc98USC.pdf soichiro araki user-space communication : a quantitative study
P scaleableserversap.ppt

P sccs-0544.ps sanjay ranka irregular personalized communication on distributed memory machines
P sch_comm.pdf

P sched_comm_smp.ps babak falsafi scheduling communication on an SMP node parallel machine
P SCIENTIFIC.COMP.ARCH.FM.ps william e johnson rationale and strategy for a 21st century scientific computing architecture: the case for using commercial symmetric multiprocessors as supercomputers
P sensitivity-bw-lat-97.ps rajeev barua the sensitivity of communication mechanisms to bandwidth and latency
P sigcomm96.ps david mosberger analysis of techniques to improve protocol processing latency
P sigmetrics.ps remzi h arpaci the interaction of parallel and sequential workloads on a network of workstations
P sigmetrics97-paper.ps aaron b brown operating system benchmarking in the wake of lmbench: a case study of the performance of NetBSD on the x86 architecture
P sivaram_PPS98.pdf rajeev sivaram HIPIQS: a high-performance switch architecture using input queuing
P smp.ps

P SMP-OSF.ps jeffrey m denham DEC OSF/1 version 3.0 symmetric multiprocessing implementation
P softw_VMMC.ps cezary dubnicki software support for virtual memory-mapped communication
P solarissmo.ps

P sort.ps andrea c dusseau fast parallel sorting under LogP: experience with the CM-5
P Sort_Logp.ps andrea carol dusseau modeling parallel sorts with LogP on the CM-5
P sosp.pdf thorsten von eicken U-Net : a user-level network interface for parallel and distributed computing
P sosp16.ps armando fox cluster-based scalable network services
P SOSP95-oschar.ps mendel rosenblum the impact of architectural trends on operating system performance
P sosp97.ps hermann hartig the performance of µ-kernel-based systems
P spaa96.ps robert d blumofe an analysis of dag-consistent distributed shared-memory algorithms
P spinlock.pdf anna r karlin empirical studies of competitive spinning for a shared-memory multiprocessor
P splc96_pap.ps vasilios georgitsis message passing performance on SP systems
P SRC-1997-016a.ps jennifer m anderson continuous profiling : where have all the cycles gone ?
P stability.ps jonathan md hill stability of communication performance in practice : from the cray T3E to networks of workstations
P stat9514.ps.gz r alexander modelling self-similar network traffic
P stott_FTCS97.pdf david t stott dependability analysis of a commercial high-speed network
P super95.ps andrew s tanenbaum a comparison of three microkernels
P sw_lan.pdf wenjian qiao network planning and tuning in switch-based LANs
P tab.gif

P tcpip.ps stephen r donaldson predictable communication on unpredictable networks : implementing BSP over TCP/IP
P Texas.pdf david patterson intelligent RAM(IRAM): chips that remember and compute
P tezuka-hpcn97.ps hiroshi tezuka PM: an operating system coordinated high performance communication library
P tezuka-ipps98.ps hiroshi tezuka pin-down cache : a virtual memory management technique for zero-copy communication
P TOMPI.ps erik d demaine a threads-only MPI implementation for the development of parallel programs
P Tools94.ps daniel a reed experimental analysis of parallel systems : techniques and open problems
P top500_9806.ps jack j dongarra TOP500 supercomputer sites 11th edition
P tr_95-12.ps swamy s kocherlakota predicting the performance of a wormhole-routed multicomputer with non-uniform communication, technical report CPS-95-12
P TR44.ps matt jacunski all-to-all broadcast on switch-based clusters of workstations
P TR93-04.ps michael a pagels cache and TLB effectiveness in the processing of network data
P tr95-02.ps dr roberto togneri parallel program analysis on workstation clusters : speedup profiling and latency hiding
P TR95-273.ps saurab nog a performance comparison of TCP/IP and MPI on FDDI, fast ethernet and ethernet
P tr96015.ps hiroshi tezuka PM: a high-performance communication library for multi-user parallel environments
P TR96-03.ps david mosberger analysis of techniques to improve protocol processing latency
P TR-96-04-01.ps yong yan an effective and practical performance prediction model for parallel computing on non-dedicated heterogeneous NOW
P tr97006.ps hiroshi tezuka pin-down cache : a virtual memory management technique for zero-copy communication
P TR-97-1.ps xing du characterizing communication interactions of parallel and sequential jobs on networks of workstations
P tr97-11.ps c greg plaxton accessing nearby copies of replicated objects in a distributed environment
P TR-97-3.ps xing du coordinating parallel processes on networks of workstations
P transbsp.ps db skillicorn multiprogramming BSP programs
P transpose_99.ps christina christara an efficient transposition algorithm for distributed memory computers
P trapeze.ps kenneth g yocum cut-through delivery in trapeze : an exercise in low-latency messaging
P two-case-delivery.pdf kenneth mackenzie exploiting two-case delivery for fast protected messaging
P ucb.ps brent n chun virtual network transport protocols for myrinet
P udp.ps stephen r donaldson predictable communication on unpredictable networks : implementing BSP over TCP/IP and UDP/IP
P unetmm.pdf matt welsh incorporating memory management into user-level network interfaces
P unet-sle.ps david oppenheimer user customization of virtual network interfaces with U-Net/SLE
P unixc-30.pdf budi rahardjo summary of UNIX commands
P Unrau_etal_OSDI94.ps ronald c unrau experiences with locking in a NUMA multiprocessor operating system kernel
P usenix96.ps bj murphy an analysis of process and memory models to support high-speed networking in a UNIX environment
P usenix-w93.ps kevin fall exploiting in-kernel data paths to improve I/O throughput and CPU availability
P user_level_protc chris maeda protocol service decomposition for high-performance networking
P util_profile_partitioning_MP.ps john d evans using utilization profiles in allocation and partitioning for multiprocessor systems
P uw-cse-93-03-01.ps chandramohan a thekkath implementing network protocols at user level
P UW-CSE-94-07-04.PS chandramohan a thekkath separating data and control transfer in distributed operating systems
P via.ps dave dunning the virtual interface architecture
P vinoski.ps steve vinoski CORBA: integrating diverse applications within distributed heterogeneous environments
P webos.pdf amin vahdat webOS: operating system services for wide area applications
P Wilton_Vranesic_SPDP.ps steven je wilton architectural support for block transfers in a shared-memory multiprocessor
P wks.ps v boudet algorithmic issues for ( distributed ) heterogeneous computing platforms
P wp-solaris2_6.pdf
sun solaris operating environment
P WRL-TN-16.ps jeffrey c mogul the effect of context switches on cache performance
P wucs-95-06.ps raman gopalakrishnan real-time upcalls: a mechanism to provide real-time processing guarantees
P wucs-96-11.ps r gopalakrishnan efficient user space protocol implementations with QoS guarantees using real-time upcalls
P Yang_ICOIN98.pdf muh-rong yang the design of a very large high performance gigabit switch with shared buffers
P YATE95_NETWORKING.PS david j yates networking support for large scale multiprocessor servers
P ZHAN95_CALL_ADM.PS zhi-li zhang call admission control schemes under generalized processor sharing scheduling
P zounds.ps ron minnich zounds: zero overhead unified network DSM system
P AroraLM94.ps.gz S. Arora On-line Algorithms for Path Selection in a Nonblocking Network