While open research issues remain in the area of parallel database machines
for relational database systems, building a highly parallel database machine for
an object-oriented database system presents a number of new challenges.
One of the first issues to resolve is how declustering should be handled. For
example, should one decluster all sets (such as set-valued attributes of a
complex object) or just top-level sets? Another question is how inter-object
references should be handled. In a relational database machine, such references
are handled by doing a join between the two relations of interest, but in an
object-oriented DBMS references are generally handled via pointers.
In particular, a tension exists between declustering a set in order to parallelize
scan operations on that set and clustering an object and the objects it references
in order to reduce the number of disk accesses necessary to access the
components of a complex object. Since clustering in a standard object-oriented
database system remains an open research issue, mixing in declustering makes
the problem even more challenging.
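The tension can be made concrete with a small sketch. It assumes a hypothetical four-node machine, a simple modulo hash on object identifiers, and an invented Assembly class with a set-valued attribute of part OIDs; none of these names come from the text above. It compares how many nodes must be visited to assemble one complex object against how evenly a scan of its parts is spread.

```python
from collections import Counter
from dataclasses import dataclass, field

NUM_NODES = 4  # assumption: a hypothetical machine with four processor/disk nodes


@dataclass
class Assembly:
    """Hypothetical complex object with a set-valued attribute of part OIDs."""
    oid: int
    part_oids: list = field(default_factory=list)


def home_node(oid, num_nodes=NUM_NODES):
    """Hash declustering (assumed scheme): the OID decides the storage node."""
    return oid % num_nodes


def place_declustered(assembly):
    """Decluster the nested set: each part lives on the node chosen by its own OID."""
    return {oid: home_node(oid) for oid in assembly.part_oids}


def place_clustered(assembly):
    """Cluster the nested set: every part is stored on its parent's node."""
    parent = home_node(assembly.oid)
    return {oid: parent for oid in assembly.part_oids}


def nodes_touched(placement):
    """Distinct nodes that must be visited to fetch all components of the object."""
    return len(set(placement.values()))


def max_scan_load(placement):
    """Largest number of parts any single node must scan (lower means more parallel)."""
    return max(Counter(placement.values()).values())


assembly = Assembly(oid=10, part_oids=[100, 101, 102, 103, 104, 105, 106, 107])
declustered = place_declustered(assembly)
clustered = place_clustered(assembly)

print("declustered:", nodes_touched(declustered), "nodes,", max_scan_load(declustered), "parts per node")
print("clustered:  ", nodes_touched(clustered), "nodes,", max_scan_load(clustered), "parts per node")
```

Under these assumptions the declustered layout spreads the scan across every node but forces a traversal of the complex object to visit all of them, while the clustered layout does the reverse.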
Another open area is parallel query processing in an OODBMS. Most OODBMSs
provide a relational-like query language based on an extension to relational algebra.
While it is possible to parallelize these operators, how should class-specific methods
be handled? If the method operates on a single object, it is certainly not worthwhile
parallelizing it. However, if the method operates on a set of values or objects that are
declustered, then it almost certainly must be parallelized if one is going to avoid moving all the
data referenced to a single processor for execution. Since it is, at this point in time,
impossible to parallelize arbitrary method code, one possible solution might be to
insist that any method that is to be parallelized be constructed using the primitives
from the underlying algebra, perhaps embedded in a normal programming language.
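As a rough illustration of that last idea, the sketch below writes a set-oriented method purely in terms of algebra-style primitives, so that it can run against each fragment of a declustered set and have its partial results merged. The fragment layout, the select and project helpers, and the heavy_weight method are all hypothetical names chosen for illustration, and threads merely stand in for separate processors.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: a declustered set, where each inner list is the fragment on one node.
fragments = [
    [{"oid": 1, "weight": 5}, {"oid": 5, "weight": 12}],
    [{"oid": 2, "weight": 7}],
    [{"oid": 3, "weight": 20}, {"oid": 7, "weight": 1}],
]


def select(pred, rows):
    """Algebra-style primitive: keep the rows that satisfy pred."""
    return [r for r in rows if pred(r)]


def project(attr, rows):
    """Algebra-style primitive: extract one attribute from each row."""
    return [r[attr] for r in rows]


def local_heavy_weight(fragment, threshold):
    """Per-node portion of the method; it only reads the local fragment."""
    heavy = select(lambda r: r["weight"] > threshold, fragment)
    return sum(project("weight", heavy))


def heavy_weight(fragments, threshold):
    """Hypothetical method: total weight of all parts heavier than threshold.

    Because the body is composed only of select, project, and a sum, each
    fragment can be processed independently and the partial sums merged.
    """
    with ThreadPoolExecutor(max_workers=len(fragments)) as pool:
        partials = list(pool.map(lambda f: local_heavy_weight(f, threshold), fragments))
    return sum(partials)


total = heavy_weight(fragments, threshold=6)
print(total)
```

Only the per-fragment partial sums cross node boundaries here, not the objects themselves, which is the data movement the restriction is meant to avoid.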