References for Sensornet filesystem
Filesytems - OceanStore, XFS,
Chord, Bayou, CFS,
SFS
Other info: Plaxton's tree location algorithm
OceanStore
Sean Rhea, Chris Wells, Patrick Eaton, Dennis Geels, Ben Zhao, Hakim
Weatherspoon, and John Kubiatowicz, Maintenance-Free
Global Data Storage. IEEE Internet Computing , Vol 5, No 5,
September/October 2001, pp 40-49. (cached)
Components of OceanStore are:
- tapestry - overlay routing and location
- fragmentation, replication, erasure coding
- read-only storage (modify by version creation), naming via GUID
- byzantine update (see Castro & Liskov)
Sean Rhea, Patrick Eaton, Dennis Geels, Hakim Weatherspoon, Ben Zhao,
and
John Kubiatowicz. Pond:
the OceanStore Prototype. Proceedings of the 2nd USENIX Conference
on File and Storage Technologies (FAST '03),
March 2003 (cached)
- uses file update model based on Bayou. no leases or locks -
application must use predicate-guarded updates.
- caching and update routing (???) sec. 3.4
- object implemented as B-tree. metadata appended to top block;
version i+1 is sparse update to version i. (copy-on-write)
- storage overhead = root block size (constant) + erasure coding
rate.
- NFS mapping - file writes trivial, directory writes handled by
predicate if-not-modified-since.
- performance - wide-area read better than NFS (4x), write worse
(3x)
"unlike PAST and CFS, OceanStore is
designed for write sharing"
"in IVY, conflicting writes must be repaired at the application level"
J.
Kubiatowicz, et al. OceanStore: An
Architecture for Global-Scale Persistent Storage. ASPLOS, December
2000. http://citeseer.ist.psu.edu/kubiatowicz00oceanstore.html
(cached)
"in CFS, servers are passive
repositories of information" CFS, not SFS
- Blaze, Cryptographic FileSystem
"promiscuous caching" - data can be cached anywhere and anytime, in
contrast to e.g. NFS and AFS. sort of like XFS
attenuated Bloom filters - BF[d] = union of bloom filters for
nodes at distance d
Plaxton's method for creating randomized hierarchical trees
Ben Zhao, John Kubiatowicz,
and Anthony
Joseph. Tapestry: An infrastructure for fault-tolerant wide-area
location and routing. Technical Report UCB/CSD-01-1141, Computer
Science Division, U. C. Berkeley, April 2001.
http://citeseer.ist.psu.edu/zhao01tapestry.html
(cached)
- description of Plaxton's method. Object root node (fixed
based on
name suffix) and pointer trails from cached copies up to root
node. Can route around failure (hill-climbing algorithm -
"sidestep" on failure of next path up)
- tapestry uses multiple roots per object by running the hash
function (to compute object name) k times, with a different fixed salt
added each time.
- incremental computation of unique root node for an object -
"surrogate routing"
what is the "coupon collector problem"?
Chord is said to provide similar logarithmic hop limits, but with
weaker worst-case guarantees and no correlation to underlying network
distance.
Note - Tapestry doesn't seem to handle low-bandwidth links well.
Or will it? A node on a slow link will appear to be farther from its
neighbors than other better-connected nodes, but will this alter
whether messages get routed through it?
XFS
Thomas E. Anderson, Michael
D. Dahlin,
Jeanna M. Neefe, David A. Patterson, Drew S. Roselli, and Randolph Y.
Wang. Serverless network file systems. In Proceedings of the 15th
Symposium on Operating Systems Principles, pages 109--126, Copper
Mountain Resort, Colorado, December 1995. ACM. http://citeseer.ist.psu.edu/anderson95serverless.html
(cached)
M. D.
Dahlin, R. Y. Wang, T. E. Anderson, D. A. Patterson.
Cooperative Caching: Using Remote Client Memory to Improve File System
Performance.
Proc. First Symposium on Operating Systems Design and
Implementation.
pp. 267-280.
November 1994 http://citeseer.ist.psu.edu/dahlin94cooperative.html
(cached)
Chord
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek,
and Hari Balakrishnan,
Chord: A Scalable Peer-to-peer Lookup Service for Internet
Applications,
ACM SIGCOMM 2001, San Deigo, CA, August 2001, pp. 149-160 http://doi.acm.org/10.1145/383059.383071
(cached)
http://pdos.lcs.mit.edu/papers/chord:sigcomm01/chord_sigcomm.pdf
Bayou
A. Demers, K. Petersen, M.
Spreitzer, D. Terry, M.M.
Theimer, and B. Welch. The bayou architecture: Support for data sharing
among mobile users. In Proceedings Workshop on Mobile Computing Systems
and Applications. IEEE, December 1994. http://citeseer.ist.psu.edu/demers94bayou.html
(cached)
- client-server, client may not
have space for simultaneous storage of all data it wants access to.
- designed for mixed, possibly very
weak connectivity (modem, floppy)
- node can act as both client and
server, can access replicas if original not available
- writes have a predicate, the dependency set, and an
application-specific procedure, the mergeproc
that is invoked when the dependency set detects a write conflict.
- Ficus and Coda have similar but
simpler mechanisms for resolving conflicts in an application-dependent
way.
- primary server per database,
function is to serialize writes and commit them
- provides both tentative and
committed view of database for reads. reads can also specify a
max age parameter to bound inconsistency.
- anti-entropy protocol (demers et
al) for reconciliation of writes
K. Petersen,
M. Spreitzer, D. Terry, and M. Theimer.
"Bayou: Replicated Database Services for Worldwide Applications." In
Proceedings 7th SIGOPS European Workshop, pp. 275--280, Connemara,
Ireland, Sept. 1996. ACM. http://citeseer.ist.psu.edu/petersen96bayou.html
(cached)
- describes version vectors for maintaining global write commit
ordering among servers
- write protocol also used to add/drop servers from network -
version vectors disambiguate case where two servers disagree about the
presence of a third server in the network.
Karin
Petersen, Mike J. Spreitzer,
Douglas B. Terry, Marvin M. Theimer, and Alan J. Demers. Flexible
update propagation for weakly consistent replication. In Proceedings of
the 16th ACM Symposium on Operating SystemsPrinciples (SOSP-16), Saint
Malo, France, October 1997.
http://citeseer.ist.psu.edu/petersen97flexible.html
D. B. Terry, M. M. Theimer, K. Petersen, A. J. Demers, M. J. Spreitzer,
and C. Hauser, Managing Update Conflicts in Bayou, a Weakly Connected
Replicated Storage System. Proceedings 15th
Symposium on Operating Systems Principles (SOSP-15) , Cooper Mountain,
Colorado, December 1995, pages 172-183.
http://citeseer.ist.psu.edu/564550.html
lgkvx73m-1.pdf
CFS
Dabek, F., Kaashoek, M. F.,
Karger, D., Morris, R., And Stoica, I. Wide-area cooperative storage
with CFS. In
Proceedings of the 18th ACM Symposium on Operating Systems Principles
(SOSP '01) (To appear; Banff, Canada, Oct. 2001). http://citeseer.ist.psu.edu/dabek01widearea.html
(cached)
SFS
Kevin Fu, M. Frans Kaashoek,
and David
Mazieres. Fast and secure distributed read-only file system. In
Proceedings of the 4th USENIX Symposium on Operating Systems Design and
Implementation (OSDI 2000), pages 181--196, San Diego, California,
October 2000. http://citeseer.ist.psu.edu/fu00fast.html
(cached)
David Mazieres , Michael Kaminsky , M. Frans Kaashoek , Emmett
Witchel, Separating key management from file system security,
Proceedings of the seventeenth ACM symposium on Operating systems
principles, p.124-139, December 12-15, 1999, Charleston, South
Carolina, United States http://doi.acm.org/10.1145/319151.319160
(cached)
Plaxton's tree algorithm
C. Plaxton, R. Rajaram, and
A. Richa.
Accessing nearby copies of replicated objects in a distributed
environment. In Proceedings of the Ninth Annual ACM Symposium on
Parallel Algorithms and Architectures, pages 311--320, June 1997.
http://citeseer.ist.psu.edu/plaxton97accessing.html
(cached)
All Cache (COMA)
Hagersten, E., Landin, A., Haridi, S.,
" DDM - A Cache-Only Memory Architecture," IEEE Computer, Vol.
25, No. 9, September 1992, pp. 44-54. (cached)
Anti-entropy, epidemic protocols
Epidemic algorithms for replicated database maintenance, Proceedings of
the sixth annual ACM Symposium on Principles of distributed computing
(PODC). Vancouver, British Columbia, Canada. pp. 1-12, 1987. http://doi.acm.org/10.1145/41840.41841
(cached)
- for gossip algorithms, if distribution of neighbor pick is
spatial with pdf between d^-(D+1) and d^-(2D), where D is
dimensioniality of mesh, then convergence time and traffic are both logN
- for constructing a spatially localized distribution, use a
distribution that is a function of Q(d), where Q(d) is the number of
sites at a distance of d or less. 1/Q(d)^2 or 1/dQ(d) work.
- push-pull gossip is best
provides a mechanism for economizing
update traffic without explicit
knowledge of the network dimensionality. Particularly good for
conserving bandwidth on critical links. (i.e. slow links that
partition the network)
Lazy replication
Rivka Ladin, Barbara Liskov, Liuba Shrira, Sanjay Ghemawat, Providing
high availability using lazy replication. ACM Transactions on Computer
Systems (TOCS), Volume 10 Issue 4 November 1992. http://doi.acm.org/10.1145/138873.138877
(cached)
Liskov - Harp, byzantine fault tolerance
Rivka Ladin, Barbara Liskov,
Liuba Shrira, Sanjay Ghemawat, Providing high availability using lazy
replication. ACM Transactions on Computer Systems (TOCS), Volume 10,
Issue 4 pp 360 - 391, 1992 http://doi.acm.org/10.1145/138873.138877
(cached)
M. Castro and B. Liskov. Practical Byzantine fault
tolerance. In Proceedings of the 3rd USENIX Symposium on Operating
Systems Design and Implementation, February 1999. http://citeseer.ist.psu.edu/castro01practical.html
(cached
TOCS paper) (full
MIT tech report)
Other papers to read
Balachander Krishnamurthy and
Craig E.
Wills. Study of piggyback cache validation for proxy caches in the
world wide web. In Symposium on Internet Technology and Systems. USENIX
Association, December 1997. http://gunther.smeal.psu.edu/684.html
http://www.cs.cornell.edu/Info/Projects/Spinglass/
http://lass.cs.umass.edu/~shenoy/courses/spring04/677/readings.html
http://www.parc.com/research/csl/projects/obje/ - the Obje system
http://www2.parc.com/csl/projects/bayou/
http://citeseer.ist.psu.edu/449381.html
- Eugene
Gendelman,
Lubomir F. Bic, and
Michael B. Dillencourt. LDFS: A Fault-Tolerant Local Disk-Based File
System for Mobile Agents. Department of Information and Computer
Science, University of California, Irvine, CA. TR 01-21.
http://citeseer.ist.psu.edu/449381.html
- S.
Soltis, G.
Erickson, K. Preslan, M.
O'Keefe, and T. Ruwart, "The Global File System: A File System for
Shared Disk Storage," submitted to the IEEE Transactions on Parallel
and Distributed Systems, October 1997.
http://citeseer.ist.psu.edu/soltis97global.html
- Y.
Saito, C.
Karamonolis, M. Karlsson,
and M. Mahalingam. Taming aggressive replication in the Pangaea
wide-area file system. In Proc. of the 5th Symp. on Operating Systems
Design and Implementation. Dec. 2002.
http://citeseer.ist.psu.edu/saito02taming.html
- A.
Adya, W. J. Bolosky, M. Castro, R.
Chaiken, G. Cermak, J. R. Douceur, J. Howell, J. R. Lorch, M. Theimer,
R. P. Wattenhofer, "FARSITE: Federated, Available, and Reliable Storage
for an Incompletely Trusted Environment", 5th OSDI, Dec 2002.
http://citeseer.ist.psu.edu/adya02farsite.html
Spatial databases
Ralf H. Guting. An Introduction to
Spatial Database Systems. The VLDB Journal, 3(4):357--399, 1994. 21
@article{615206,
author = {Ralf Hartmut G\&\#252;ting},
title = {An introduction to spatial database systems},
journal = {The VLDB Journal},
volume = {3},
number = {4},
year = {1994},
issn = {1066-8888},
pages = {357--399},
publisher = {Springer-Verlag New York, Inc.},
}
Distributed database indices
Brigitte Kroll and Peter Widmayer.
Balanced distributed search trees do
not exist. In Algorithms and Data
Structures, 4th International Workshop, volume 955 of Lecture Notes in
Computer Science, pages 50--61, Kingston, Ontario, Canada, 16--18
August 1995. http://citeseer.ist.psu.edu/article/kroll95balanced.html
(cached)
There are a fair number of
papers referenced on the web page of the Data Engineering
Laboratory, Aristotle University, Greece. One of them:
@inproceedings{ bozanis-ldt,
author = "Panayiotis Bozanis and Yannis Manolopoulos",
title = "LDT: a Logarithmic Distributed Search",
booktitle = "Proceedings 4th Workshop on Distributed Data and Structures (WDAS'02)",
city = "Paris",
year = 2002,
url = "citeseer.ist.psu.edu/527887.html" }
A lot of papers show up in VLDB.
Witold Litwin, Marie-Anne Neimat, Donovan A. Schneider:
RP*: "A Family of Order Preserving
Scalable Distributed Data Structures."
VLDB 1994: 342-353 (cached)
S. D. Gribble, E. A. Brewer,
J. M.
Hellerstein, and D. Culler, "Scalable, distributed data structures for
Internet service construction.," in In Symposium on Operating Systems
Design and Implementation, (San Diego, CA), October 2000. http://citeseer.ist.psu.edu/gribble00scalable.html
(cached)
Gnutella
Other
Workshop on Networked Group Communication (2002):
Program -
has links to various people working on P2P etc.
completely unrelated:
J. Hill, R. Szewczyk, A. Woo,
S.
Hollar, D. Culler, K. Pister, "System architecture directions for
networked sensors", submitted for publication.
http://citeseer.ist.psu.edu/382595.htm (cached) This describes the Berkeley
mote
and TinyOS in some detail, and appears to be the first publication on
it.
last
modified 4/1/04, pjd