Fix thrift client connection for Kerberos Hive Client #1747
Fokko merged 6 commits into apache:main
hussein-awala
left a comment
IMHO, it would be better to define the client as a cached_property, and just return it.
+1 for moving the init out of the __init__ func, and converting the client to a cached_property will help to do this, where the user will be able to use the client with and without a context manager.
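For readers following the suggestion, here is a minimal sketch of what a `cached_property` client could look like. `FakeTransport`, `FakeMetastoreClient`, and `LazyHiveClient` are hypothetical stand-ins invented for illustration; they are not the PyIceberg or Thrift classes, and the real `_HiveClient` may be structured differently.

```python
from functools import cached_property


class FakeTransport:
    """Hypothetical stand-in for a Thrift transport."""

    def __init__(self) -> None:
        self._open = False

    def isOpen(self) -> bool:
        return self._open

    def open(self) -> None:
        self._open = True

    def close(self) -> None:
        self._open = False


class FakeMetastoreClient:
    """Hypothetical stand-in for the generated metastore client."""

    def __init__(self, transport: FakeTransport) -> None:
        self.transport = transport

    def get_all_databases(self) -> list[str]:
        if not self.transport.isOpen():
            raise RuntimeError("Transport not open")
        return ["default"]


class LazyHiveClient:
    """Builds the client on first access instead of in __init__."""

    def __init__(self) -> None:
        # No connection work happens here.
        self._transport = FakeTransport()

    @cached_property
    def client(self) -> FakeMetastoreClient:
        # Runs once; later accesses return the cached object.
        self._transport.open()
        return FakeMetastoreClient(self._transport)

    def __enter__(self) -> FakeMetastoreClient:
        return self.client

    def __exit__(self, *exc_info: object) -> None:
        self._transport.close()
```

With this shape, `LazyHiveClient().client.get_all_databases()` and `with LazyHiveClient() as c: ...` both work, which is the "with and without a context manager" behavior described above.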
|
@kevinjqliu I think this one is also good for 0.9.1 |
|
Gentle ping @kevinjqliu. Any thoughts on the |
Hi @hussein-awala, just to make sure I have this right, would this mean moving most of the logic in

Edit: I'm asking because there might be cases where the above could actually result in a user invoking methods on a client whose underlying transport has been closed. For example:

    hive_client = _HiveClient(...)
    p_client = hive_client.client
    with hive_client as open_client:
        print(open_client.get_all_databases())
    print(p_client.get_all_databases())  # Results in TTransportException: Transport not open

So it's likely that this is not what you meant. |
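The scenario above can be reproduced without a metastore. The sketch below assumes a cached client that shares one transport with the context manager; `TransportClosedError`, `FakeTransport`, `FakeClient`, and `CachedHiveClient` are hypothetical names standing in for the Thrift and PyIceberg classes.

```python
from functools import cached_property


class TransportClosedError(Exception):
    """Hypothetical stand-in for Thrift's TTransportException."""


class FakeTransport:
    def __init__(self) -> None:
        self.is_open = False


class FakeClient:
    def __init__(self, transport: FakeTransport) -> None:
        self._transport = transport

    def get_all_databases(self) -> list[str]:
        if not self._transport.is_open:
            raise TransportClosedError("Transport not open")
        return ["default"]


class CachedHiveClient:
    def __init__(self) -> None:
        self._transport = FakeTransport()

    @cached_property
    def client(self) -> FakeClient:
        self._transport.is_open = True
        return FakeClient(self._transport)

    def __enter__(self) -> FakeClient:
        self._transport.is_open = True
        return self.client

    def __exit__(self, *exc_info: object) -> None:
        # Closes the transport that the cached client still points at.
        self._transport.is_open = False


hive_client = CachedHiveClient()
p_client = hive_client.client  # reference taken outside the context manager
with hive_client as open_client:
    inside = open_client.get_all_databases()  # works: transport is open

try:
    p_client.get_all_databases()
    stale_call_failed = False
except TransportClosedError:
    stale_call_failed = True  # the earlier reference is now unusable
```

Because `p_client` and `open_client` are the same cached object, leaving the `with` block closes the transport for both.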
|
Thanks for chiming in @hussein-awala @mnzpk. Please take a look at the new implementation. The context manager ( @mnzpk could you give this a try? |
6b8d2ee to
7b21b5b
Compare
|
CI's currently failing for main branch, see https://github.com/apache/iceberg-python/pull/1899/files#r2040915222
|
7b21b5b to
d788c8d
Compare
|
Cool, CI passes now. @mnzpk, please take a look when you have time :) |
Fokko
left a comment
Thanks for fixing this @kevinjqliu 🙌
Closes #1744

`TSaslClientTransport` cannot be reopened. This PR changes the behavior to recreate a `TSaslClientTransport` when it's already closed. Note: `_HiveClient` should be used with a context manager, but can be used without one.
|
I was getting this exact error. Thanks for fixing this issue. Any idea how I can provide Kerberos-related details when creating the catalog, such as kerberos principal, kerberos_keytab, kerberos_service_name, and kerberos_user? I searched the documentation but could not find anything except `"hive.kerberos-authentication": "true"`. |
|
@abhisheksinha-pty I'm not really sure how those params are passed into Hive/Kerberos. Here is where we create the Hive client; I would assume you have to pass the params in there somehow. Do you know how they're passed into a regular Hive client? |
mnzpk
left a comment
Apologies for not reviewing this sooner but unfortunately, installing from main, I can still reproduce the reported issue. I've left a few comments with what I think would fix that. I've also added a few tests here that would allow testing this without having kerberos auth or a kerberized metastore instance set up. Not sure how useful you think those would be but let me know!
Thanks so much for all your work here.
    """Make sure the transport is initialized and open."""
    if not self._transport.isOpen():
        try:
            self._transport.open()
It seems that we're still going to try re-opening the transport once it has been closed and since the exception raised in that case would be a TypeError, it would not be caught by the except below.
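To illustrate the point, here is a small sketch under the assumption stated in the comment (that reopening a closed transport raises `TypeError`). `TransportError` and `OneShotTransport` are hypothetical stand-ins, not Thrift classes; `open_narrow` mirrors the reviewed snippet and `open_broad` shows one possible way to handle both exception types.

```python
class TransportError(Exception):
    """Hypothetical stand-in for TTransport.TTransportException."""


class OneShotTransport:
    """Hypothetical transport that raises TypeError when reopened."""

    def __init__(self) -> None:
        self._state = "new"

    def isOpen(self) -> bool:
        return self._state == "open"

    def open(self) -> None:
        if self._state == "closed":
            raise TypeError("cannot reopen a closed transport")
        self._state = "open"

    def close(self) -> None:
        self._state = "closed"


def open_narrow(transport: OneShotTransport) -> OneShotTransport:
    """Only catches TransportError, so the TypeError propagates."""
    if not transport.isOpen():
        try:
            transport.open()
        except TransportError:
            transport = OneShotTransport()
            transport.open()
    return transport


def open_broad(transport: OneShotTransport) -> OneShotTransport:
    """Catches both exception types and replaces the dead transport."""
    if not transport.isOpen():
        try:
            transport.open()
        except (TypeError, TransportError):
            transport = OneShotTransport()
            transport.open()
    return transport
```

Calling `open_narrow` on a transport that was opened and then closed lets the `TypeError` escape, while `open_broad` returns a fresh, open transport. An alternative design would skip the reopen attempt entirely and always create a new transport once the old one has been closed.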
            self._transport.open()
        except TTransport.TTransportException:
            # reinitialize _transport
            self._transport = self._init_thrift_transport()
We're re-initializing the transport but since self._client is a cached_property, it'd still point to the old (and now closed) transport so I think we'd also want to delete self._client so that it gets re-created?
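A minimal sketch of the invalidation being suggested: `functools.cached_property` stores its value in the instance `__dict__`, so removing that entry forces the next access to rebuild the client against the new transport. `Transport`, `Client`, and `Holder` are hypothetical names used only for this illustration.

```python
from functools import cached_property


class Transport:
    """Hypothetical stand-in; each instance represents one connection."""


class Client:
    def __init__(self, transport: Transport) -> None:
        self.transport = transport


class Holder:
    def __init__(self) -> None:
        self._transport = Transport()

    @cached_property
    def _client(self) -> Client:
        return Client(self._transport)

    def reinitialize(self) -> None:
        self._transport = Transport()
        # Drop the cached value so the next access to self._client
        # builds a client bound to the new transport. Using pop() with
        # a default avoids an error when nothing has been cached yet.
        self.__dict__.pop("_client", None)
```

`del self._client` has the same effect when the value has already been cached, but raises `AttributeError` if it has not, which is why the sketch uses `pop` with a default.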
Closes #1744 (second try)
# Rationale for this change
First try (#1747) did not fully resolve the issue. See
#1747 (review)
# Are these changes tested?
yes
# Are there any user-facing changes?
---------
Co-authored-by: mnzpk <84433140+mnzpk@users.noreply.github.com>