Discussion:
Question about INSERT OVERWRITE TABLE with dynamic partition
a***@yahoo.com
2018-10-24 04:34:22 UTC
Dear all,
I found an interesting thing.
When inserting an empty (NULL) result into a partition that already contains records, a static partition INSERT and a dynamic partition INSERT give different results.
See the example below:
Partition '20180101' of table A contained 100 records.
By using
we can delete the records in partition '20180101'.
But by using
there would be no change to the partition '20180101'.
In fact, if we run 'select * from A where partition_A = '20180101'', we still get 100 records from it.
I would appreciate an explanation for this.
Thanks!



孙志犹
Dheena Dhayalan
2018-10-24 09:36:59 UTC
unsubscribe

Lefty Leverenz
2018-10-28 04:32:04 UTC
Dheena, to unsubscribe please send a message to
user-***@hive.apache.org as described here: Mailing Lists
<http://hive.apache.org/mailing_lists.html>.


Thanks. -- Lefty


Tanvi Thacker
2018-10-25 00:34:19 UTC
A logical explanation could be:
In the first query, you tell Hive which partition to overwrite, so the
step that deletes the partition data and replaces it with the query
result knows which partition to delete, and there is an empty
result/file to move in.

For the second query, the dynamic partition step has to deduce the
partition name from the query result, but because your query produces
no rows, there is no partition information to act on.

Regards,
Tanvi Thacker
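
The explanation above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Hive's actual implementation: the table is modeled as a plain dict from partition name to rows, and the names static_overwrite, dynamic_overwrite, make_table and partition_of are hypothetical helpers invented for this illustration.

```python
from collections import defaultdict


def static_overwrite(table, partition, rows):
    """Toy model: the statement names the partition, so it is always replaced,
    even when the query result is empty."""
    table[partition] = list(rows)


def dynamic_overwrite(table, rows, partition_of):
    """Toy model: partitions are discovered from the result rows, so only
    partitions that actually appear in the result are replaced."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[partition_of(row)].append(row)
    for partition, part_rows in grouped.items():
        table[partition] = part_rows


def make_table():
    # Hypothetical table A: partition '20180101' holds 100 records.
    return {"20180101": [("record", i) for i in range(100)]}


# Static case: an empty result still clears the named partition.
static_table = make_table()
static_overwrite(static_table, "20180101", [])
static_count = len(static_table["20180101"])

# Dynamic case: an empty result names no partition, so nothing is touched.
dynamic_table = make_table()
dynamic_overwrite(dynamic_table, [], partition_of=lambda row: row[-1])
dynamic_count = len(dynamic_table["20180101"])

print(static_count, dynamic_count)  # prints "0 100"
```

In this model the dynamic case leaves the 100 records in place simply because the loop over discovered partitions never runs, which matches the behavior reported in the original question.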

a***@yahoo.com
2018-10-26 03:25:01 UTC
Thanks, I think that is the right explanation. Because the result of the second query is empty, no partition name is generated in the dynamic partition step, so the system doesn't know which partition to overwrite.
Thanks very much!


Regards,
孙志犹
