Multiple keys.

Bart
Hi there!

I'm trying to build a cube, and I get the following log in "#4 Step Name: Build Dimension Dictionary":

"java.lang.IllegalStateException: The table: AUTORSHIP Dup key found, key=[13142], value1=[13142,1088], value2=[13142,1918]".

I have the tables BOOK and AUTHOR. The relationship between BOOK and AUTHOR is many-to-many (n:n), so I created a table AUTORSHIP between them to get 1:n and n:1 relationships.

Assumptions:
One AUTHOR can have more than one BOOK, and one BOOK can be written by more than one AUTHOR, so I created an artificial table between them (AUTORSHIP).

Where can I fix this? If I put the author into BOOK, the rows will be duplicated as well...

Thanks in advance,
Bart

Re: Multiple keys.

Bart
Oh my God, probably I have to change Hive tables...

Re: Multiple keys.

shaofengshi
Hi Bart,

Kylin assumes the join columns are the PK of the lookup table; it then takes
a snapshot of the table as a hash map with the PK as the key. If the join
column isn't unique, it reports this error.
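
As a rough illustration of that behaviour (a minimal sketch, not Kylin's actual code), a snapshot keyed by the join column can be modelled as a dictionary that rejects a second row with the same key. The function name `build_snapshot` is hypothetical, and the column order (book ID first, author ID second) is an assumption based on the values in the error message.

```python
def build_snapshot(table_name, rows, key_index):
    """Build a dict keyed by the join column; fail on a repeated key."""
    snapshot = {}
    for row in rows:
        key = row[key_index]
        if key in snapshot:
            # Same situation as the "Dup key found" error in the build log.
            raise ValueError(
                f"The table: {table_name} Dup key found, key=[{key}], "
                f"value1={list(snapshot[key])}, value2={list(row)}"
            )
        snapshot[key] = row
    return snapshot


# The two rows reported in the error message (column order is assumed).
autorship_rows = [(13142, 1088), (13142, 1918)]

try:
    build_snapshot("AUTORSHIP", autorship_rows, key_index=0)
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
```

With a table whose key column really is unique, the same function simply returns the dictionary.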

In your case, I think you're joining the AUTORSHIP table with "AUTHOR" as
the condition, and "AUTHOR" isn't unique. Furthermore, joining in this way
makes rows from the fact table appear multiple times in the flat table,
which makes the aggregated results wrong. So you need to consolidate the
data model first. These are just my comments.
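
The row duplication described above can be sketched in a few lines. This is only an illustration under assumptions: the fact rows and the `amount` measure are hypothetical, and the AUTORSHIP rows are the two from the error message with an assumed (book ID, author ID) column order.

```python
from collections import defaultdict

# Hypothetical fact rows: (ID_BOOK, amount). The measure is made up.
fact_rows = [(13142, 10.0)]

# Rows from the error message; column order (book, author) is assumed.
autorship_rows = [(13142, 1088), (13142, 1918)]


def flat_join(fact, bridge):
    """Inner-join fact rows to bridge rows on the book ID."""
    authors_by_book = defaultdict(list)
    for book_id, author_id in bridge:
        authors_by_book[book_id].append(author_id)
    return [
        (book_id, author_id, amount)
        for book_id, amount in fact
        for author_id in authors_by_book[book_id]
    ]


flat = flat_join(fact_rows, autorship_rows)
true_total = sum(amount for _, amount in fact_rows)
flat_total = sum(amount for _, _, amount in flat)

# The single fact row appears once per author, so the sum is inflated.
print(len(flat), true_total, flat_total)
```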

2017-08-02 21:46 GMT+08:00 Bart <[hidden email]>:

> Oh my God, probably I have to change Hive tables...

--
Best regards,

Shaofeng Shi 史少锋

Re: Multiple keys.

Bart
Hi!
Thanks for the answer.

"AUTHOR" is unique, I think.

My "AUTHOR" table looks like this:
"ID_Autor Name&Surname
1001 Shery Denita Dietrich
1002 Cruz  Luettgen
1003 Tonita  Robertson
1004 Donya Apryl Ryan
1005 Wilhemina Kirstie Phillips
1006 Elisa  Lind
1007 Dionna Latarsha Lakin
1008 Annamarie Elizabet Senger
1009 Eddy  Orn
1010 Latrisha  Powlowski
1011 Tuyet  Marvin
1012 Meggan Gwenda Wagner
1013 Heidi Darrick Will
1014 Alissa Emogene Nienow
1015 Felix Tenisha Carr
1016 Carolina Clinton Kling
1017 Stacey Edwin Mann
1018 Graciela  Berge
1019 Karol Carol Hackett
1020 Elba Gala Kuvalis
1021 Phyliss Mindi Terry
"
And so on...

The "BOOK" table looks like this:
"ID_BOOK ISBN Type Title Price
13134 0-00-8544-66222-5 dramat The Names's Waves cheap"
and so on...

And the "AUTORSHIP" table just takes the IDs of BOOK and AUTHOR and connects them. For example:

(13134, 1020)

One book can be written by more than one author, and one author can have more than one book.

Of course, as you said, I join AUTORSHIP to AUTHOR (via the join in the Lookup Table tab) and AUTORSHIP to BOOK as well. Is there a way to do this properly? Do I have to combine the data from these tables into one table?
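
One way to see why a bridge table like this trips the unique-key assumption is to count how often each candidate key occurs. The sketch below is only an illustration: the helper `duplicate_keys` is hypothetical, the column order (book ID, author ID) is assumed, and the last row is invented to reflect an author with two books.

```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return the key tuples that occur more than once in rows."""
    counts = Counter(tuple(row[i] for i in key_columns) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


autorship_rows = [
    (13134, 1020),  # example pair from this post
    (13142, 1088),  # from the error message
    (13142, 1918),  # from the error message
    (13134, 1088),  # hypothetical: the same author on a second book
]

dup_books = duplicate_keys(autorship_rows, [0])
dup_authors = duplicate_keys(autorship_rows, [1])
dup_pairs = duplicate_keys(autorship_rows, [0, 1])

print(dup_books, dup_authors, dup_pairs)
```

In this sample neither column is unique on its own; only the (book, author) pair is.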