您的位置:首页 > 数据库

PostgreSQL在何处处理 sql查询之二十七

2013-05-29 10:46 399 查看
接前面,仔细看这个 :这 add_base_rels_to_query 是个递归调用嘛。

想像一下: select * from tst01 where id in (select sid from tst02) or id in (select sid from tst03) 之类的,此函数将层层深入,构造一个二叉树式样的语法树。

void
add_base_rels_to_query(PlannerInfo *root, Node *jtnode)
{
if (jtnode == NULL)
return;
if (IsA(jtnode, RangeTblRef))
{
int    varno = ((RangeTblRef *) jtnode)->rtindex;
(void) build_simple_rel(root, varno, RELOPT_BASEREL);
}
else if (IsA(jtnode, FromExpr))
{
FromExpr   *f = (FromExpr *) jtnode;
ListCell   *l;

foreach(l, f->fromlist)
add_base_rels_to_query(root, lfirst(l));
}
else if (IsA(jtnode, JoinExpr))
{
JoinExpr   *j = (JoinExpr *) jtnode;

add_base_rels_to_query(root, j->larg);
add_base_rels_to_query(root, j->rarg);
}
else
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(jtnode));
}


这个时候,我想到一个问题,在PostgreSQL中,有没有并行查询的可能呢?

那么,再上溯到更高的层面,再次梳理一下:

exec_simple_query中调用 了 plan_tree_list,从语法分析树得到了计划树...执行 PortalRun。

而上面的 add_base_rels_to_query 是plan_tree_list 里的一部分。

调用关系如下:

plan_tree_list -->pg_plan_queries-->pg_plan_query-->planner

planner-->subquery_planner-->groupingplanner

groupingplanner-->query_planner-->add_base_rels_to_query

/*
* exec_simple_query
*
* Execute a "simple Query" protocol message.
*/
static void
exec_simple_query(const char *query_string)
{
...
isTopLevel = (list_length(parsetree_list) == 1);
...
foreach(parsetree_item, parsetree_list)
{
...
querytree_list = pg_analyze_and_rewrite(parsetree, query_string,
NULL, 0);

plantree_list = pg_plan_queries(querytree_list, 0, NULL);
...

/*
* Create unnamed portal to run the query or queries in. If there
* already is one, silently drop it.
*/
portal = CreatePortal("", true, true);
/* Don't display the portal in pg_cursors */
portal->visible = false;

/*
* We don't have to copy anything into the portal, because everything
* we are passing here is in MessageContext, which will outlive the
* portal anyway.
*/
PortalDefineQuery(portal,
NULL,
query_string,
commandTag,
plantree_list,
NULL);

/*
* Start the portal.
*
* If we took a snapshot for parsing/planning, the portal may be able
* to reuse it for the execution phase.  Currently, this will only
* happen in PORTAL_ONE_SELECT mode.  But even if PortalStart doesn't
* end up being able to do this, keeping the parse/plan snapshot
* around until after we start the portal doesn't cost much.
*/
PortalStart(portal, NULL, 0, snapshot_set);
...

/*
* Run the portal to completion, and then drop it (and the receiver).
*/
(void) PortalRun(portal,
FETCH_ALL,
isTopLevel,
receiver,
receiver,
completionTag);

(*receiver->rDestroy) (receiver);

PortalDrop(portal, false);

...
}                            /* end loop over parsetrees */

/*
* Close down transaction statement, if one is open.
*/
finish_xact_command();
...
}


加入调试信息,看看一个带子查询的SQL,能否有并行查询的可能。

[postgres@lex pgsql]$ ./bin/psql
psql (9.2.1)
Type "help" for help.

postgres=# select * from tst01 where id IN (select sid from tst02) or id IN (select sid from tst03);
id
-----
100
200
(2 rows)

postgres=#


看后台的信息,知道planner调用了三次,PortalRun只调用了一次。

[postgres@lex pgsql]$ ./bin/pg_ctl -D ./data start
server starting
[postgres@lex pgsql]$ LOG:  database system was shut down at 2013-05-29 10:12:02 CST
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
In query_planner.... by process 11796
In get_relation_info ...oid of table is: 16387 ...by process 11796
In query_planner.... by process 11796
In query_planner.... by process 11796
In get_relation_info ...oid of table is: 16390 ...by process 11796
In query_planner.... by process 11796
In query_planner.... by process 11796
In get_relation_info ...oid of table is: 16384 ...by process 11796
In query_planner.... by process 11796
In PortalRun ... by 11796


也就是说,PostgreSQL 没有子查询的并行查询能力了?!

至少目前,它还非常地不完整:

http://postgresql.1045698.n5.nabble.com/Parallel-query-execution-td5740431.html

http://wiki.postgresql.org/wiki/Parallel_Query_Execution

各种集群方案也作不到这点:

http://www.postgresql.org/docs/current/static/different-replication-solutions.html

Many of the above solutions allow multiple servers to handle multiple queries, but none allow a single query to use multiple servers to complete faster.

对比下Oracle的:

http://miracle.blog.51cto.com/255044/147058/
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: