Kylin assumes the join columns are a primary key (PK); it then takes a hash-map snapshot
of the lookup table with the PK as the key. If the join column isn't unique, it will report an error.
In your case, I think you're joining the AUTHORSHIP table with "AUTHOR" as the
condition, but "AUTHOR" isn't unique. Furthermore, joining this way will
cause rows from the fact table to appear multiple times in the flat table,
which makes the aggregated results wrong. So you need to consolidate the data
model first. These are just my comments.
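To make the two problems above concrete, here is a minimal sketch in plain Python. It is not Kylin's actual code, only an illustration under assumptions: the `sales` fact rows, the sample `authorship` rows, and the helper names `build_snapshot` and `flat_join` are all hypothetical. It shows that a hash map keyed on a non-unique column cannot be built cleanly, and that joining on such a column repeats fact rows and inflates a sum.

```python
# Hypothetical bridge rows: (book_id, author_id). Author 1001 appears twice.
authorship = [(13134, 1001), (13134, 1002), (13135, 1001)]

# Hypothetical fact rows: (author_id, amount).
sales = [(1001, 10.0), (1002, 5.0)]


def build_snapshot(rows, key_index):
    """Build a dict keyed on the join column; fail if a key repeats."""
    snapshot = {}
    for row in rows:
        key = row[key_index]
        if key in snapshot:
            raise ValueError(f"duplicate key in lookup table: {key}")
        snapshot[key] = row
    return snapshot


def flat_join(facts, lookup, fact_key, lookup_key):
    """Naive join: emit one flat row per matching (fact, lookup) pair."""
    return [f + l for f in facts for l in lookup if f[fact_key] == l[lookup_key]]


# Problem 1: the join column (author_id) is not unique in the lookup table.
snapshot_error = None
try:
    build_snapshot(authorship, key_index=1)
except ValueError as exc:
    snapshot_error = str(exc)

# Problem 2: joining on the non-unique column duplicates fact rows.
flat = flat_join(sales, authorship, fact_key=0, lookup_key=1)
true_total = sum(amount for _, amount in sales)  # sum over the fact table
flat_total = sum(row[1] for row in flat)         # sum over the flat table

print(snapshot_error)
print(len(sales), "fact rows became", len(flat), "flat rows")
print("true total:", true_total, "flat-table total:", flat_total)
```

The fact row for author 1001 matches two AUTHORSHIP rows, so its amount is counted twice in the flat table. That is the kind of wrong aggregation described above.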
> Oh my God, probably I have to change Hive tables...
My "AUTHOR" table looks like this:
1001 Shery Denita Dietrich
1002 Cruz Luettgen
1003 Tonita Robertson
1004 Donya Apryl Ryan
1005 Wilhemina Kirstie Phillips
1006 Elisa Lind
1007 Dionna Latarsha Lakin
1008 Annamarie Elizabet Senger
1009 Eddy Orn
1010 Latrisha Powlowski
1011 Tuyet Marvin
1012 Meggan Gwenda Wagner
1013 Heidi Darrick Will
1014 Alissa Emogene Nienow
1015 Felix Tenisha Carr
1016 Carolina Clinton Kling
1017 Stacey Edwin Mann
1018 Graciela Berge
1019 Karol Carol Hackett
1020 Elba Gala Kuvalis
1021 Phyliss Mindi Terry
And so on...
The "BOOK" table looks like this:
ID_BOOK  ISBN               Type    Title              Price
13134    0-00-8544-66222-5  dramat  The Names's Waves  cheap
and so on...
And the "AUTORSHIP" table just takes the IDs of BOOK and AUTHOR and connects them. For example:
One book can be written by more than one author, and one author can have more than one book.
Of course, as you said, I join Autorship to Autor (via the join in the Lookup Table tab) and Autorship to Book as well. Is there any way to do this properly? Do I have to combine the data from these tables into one table?