Re: [HACKERS] [hackers]development suggestion needed
From | Bruce Momjian |
---|---|
Subject | Re: [HACKERS] [hackers]development suggestion needed |
Date | |
Msg-id | 200001140109.UAA23412@candle.pha.pa.us |
In response to | [hackers]development suggestion needed (xun@cs.ucsb.edu (Xun Cheng)) |
List | pgsql-hackers |
> I have background in relational database management system
> research and I want to try to be a developer for PostgreSQL.
> Right now I only try to be familiar with your code base. I
> plan to start with a specific function module in the backend.
> I'm thinking of /docs/pgsql/src/backend/executor because
> I want to experiment with some new fast join algorithms.
> My long term objective is to introduce materialized view
> subsystem into PostgreSQL. Could anyone tell me if
> the directory /docs/pgsql/src/backend/executor is the
> right place to start or just give me some general suggestions
> which are not in the FAQs? Oh one more thing I want to
> mention is that those join algorithms I want to experiment
> with may have some special data access paths similar to an index.

Good.

>
> Further if it doesn't bother you much, could someone
> answer the following question(s) for me? (Sorry if
> some are already in the docs)
> 1. Does postgresql do raw storage device management or it relies
> on file system? My impression is no raw device. If no,
> is it difficult to add it and possibly how?

No, only file system. We don't see much advantage to raw i/o.

> 2. Do you have standard benchmark results for postgresql?
> I guess not since it only implements a subset of SQL'92.
> What about subset of a benchmark or something repeatable?

We do the Wisconsin. I think it is in the source tree.

> 3. Suppose I have added a new two rel. join algorithm, how
> would I proceed to compare the performance of it with
> the existing two relation join algorithms under
> different scenarios? Are there any existing facilities
> in the current code base for this purpose? Am I right
> that the available join algos implemented are nested loop
> join (including index-based), hash join (which one? hybrid),
> sort-merge join?

You can control the join types used with flags to postgres. Very easy.

> 4. Usually a single sequential pass of a large joining relation
> is preferred to random access in large join operation.
> It's mostly because of the current disk access characteristics.
> Is it possible for me to do some benchmarking about this
> using postgresql? What I'm actually asking are the issues about
> how to control the flow of data from disk to buffers,
> how to stop file system interference and how to arrange
> actual data placement on the disk.

Good idea. We deal with this regularly in deciding to use an index in
the optimizer or a sequential scan. Our optimizer is quite good.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
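
For readers following question 3, the three join strategies named there (nested loop, hash, sort-merge) can be compared on identical inputs with a small harness. What follows is only a toy in-memory sketch in Python, not PostgreSQL executor code: rows are assumed to be tuples whose first element is the join key, and every name in it (`nested_loop_join`, `hash_join`, `sort_merge_join`, `make_relation`, `compare`) is hypothetical and chosen for illustration.

```python
import random
import time
from collections import defaultdict


def nested_loop_join(outer, inner):
    """Compare every outer row with every inner row on the key (row[0])."""
    return [(o, i) for o in outer for i in inner if o[0] == i[0]]


def hash_join(outer, inner):
    """Build an in-memory hash table on the inner input, then probe it."""
    table = defaultdict(list)
    for row in inner:
        table[row[0]].append(row)
    return [(o, i) for o in outer for i in table.get(o[0], [])]


def sort_merge_join(outer, inner):
    """Sort both inputs on the key, then merge runs of equal keys."""
    left = sorted(outer, key=lambda r: r[0])
    right = sorted(inner, key=lambda r: r[0])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            i += 1
        elif left[i][0] > right[j][0]:
            j += 1
        else:
            key = left[i][0]
            # Find the end of the run of matching keys on the right side.
            j_end = j
            while j_end < len(right) and right[j_end][0] == key:
                j_end += 1
            # Pair every left row in the run with every right row in the run.
            while i < len(left) and left[i][0] == key:
                out.extend((left[i], r) for r in right[j:j_end])
                i += 1
            j = j_end
    return out


def make_relation(n_rows, n_keys, seed):
    """Hypothetical test relation: (join_key, row_id) tuples, random keys."""
    rng = random.Random(seed)
    return [(rng.randrange(n_keys), row_id) for row_id in range(n_rows)]


def compare(outer, inner):
    """Time each join on the same inputs and check that the results agree."""
    timings, reference = {}, None
    for join in (nested_loop_join, hash_join, sort_merge_join):
        start = time.perf_counter()
        result = join(outer, inner)
        timings[join.__name__] = time.perf_counter() - start
        result.sort()  # output order differs per algorithm; normalize it
        if reference is None:
            reference = result
        assert result == reference, join.__name__ + " disagrees"
    return timings


if __name__ == "__main__":
    r = make_relation(2000, 500, seed=1)
    s = make_relation(2000, 500, seed=2)
    for name, seconds in compare(r, s).items():
        print(f"{name:18s} {seconds:.4f}s")
```

Varying `n_rows` and `n_keys` gives different scenarios (many duplicates versus nearly unique keys). Timings from a sketch like this say nothing about disk access patterns, which is the concern raised in question 4.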