Skip to content

Deactivate ehcache#2913

Merged
yadvr merged 2 commits intoapache:masterfrom
marcaurele:remove-ehcache
Jun 5, 2019
Merged

Deactivate ehcache#2913
yadvr merged 2 commits intoapache:masterfrom
marcaurele:remove-ehcache

Conversation

@marcaurele
Copy link
Member

Description

This PR is for deactivating Ehcache in CloudStack since it is not usable. The first commit remove the default RMI cache peering configured for multicast which most of the time cannot work. It also requires to have an interface up which is not always the case while developing offline.
The second commits remove the configuration to activate caching on some DAOs.

Problems

The code in CS does not seem to fit any caching mechanism especially due to the homemade DAO code. The main 3 flaws are the following:

Entities are not expected to be shared

There is quite a lot of code with method calls passing entity IDs value as long, which does some object fetching. Without caching, this behavior will create distinct objects each time an entity with the same ID is fetched. With the cache enabled, the same object will be shared among those methods. It has been seen that it does generate some side effects where code still expected unchanged entity attributes after calling different methods thus generating exception/bugs.

DAO update operations are using search queries

Some part of the code are updating entities based on a search query, therefore the whole cache must be invalidated (see GenericDaoBase: public int update(UpdateBuilder ub, final SearchCriteria<?> sc, Integer rows);).

Entities based on views joining multiple tables

There are quite a lot of entities based on SQL views joining multiple entities in a same object. Enabling caching on those would require a mechanism to link and cross-remove related objects whenever one of the sub-entity is changed.

Final word

Based on the previously discussed points, the best approach IMHO would be to move out of the custom DAO framework in CS and use a well known one (out of scope of this change of course). It will handle caching well and the joins made by the views in the code. It's not an easy change, but it will fix along a lot of issues and add a proven / robust framework to an important part of the code.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

How Has This Been Tested?

A cluster of management servers with a manual ehcache RMI configuration has been setup to test different changes in the caching. The RMI cache setup has been verified though cache invalidation being propagated to the other server. A series of integration tests have been run, while figuring out the reason for (random) errors.

@rafaelweingartner
Copy link
Member

@marcaurele as long as you are removing the use of "ehCache", what about removing its dependency in our POM.xml files too?

@yadvr
Copy link
Member

yadvr commented Oct 22, 2018

@marcaurele I only see the RMI config removed, can you instead provide the ehcache configuration to be used that can work out of the box by default? The GenericDaoBase initialises and still uses ehcache's Cache, CacheManager, Element etc.

@marcaurele
Copy link
Member Author

@rafaelweingartner @rhtyd Before doing to much work if people would not agree with my findings I simply wanted to push this PR to open the discussion on ehcache topic. I will post another one removing all the code that use ehcache to clean it up if this one is going to be merged.
Are those findings not a surprise to you ? cc @wido

@yadvr
Copy link
Member

yadvr commented Oct 23, 2018

@marcaurele the idea of ehcache was to reduce some load to the db server. Can you start a discussion thread on dev@ with details like what and how you'll replace ehcache. And your long term plan re:DB refactoring, the goals, proposed timeline etc.

@marcaurele
Copy link
Member Author

@rhtyd sure

@wido
Copy link
Contributor

wido commented Oct 23, 2018

I have seen some issues with ehcache as well on mgmt servers, we disabled it often on deployment.

@marcaurele
Copy link
Member Author

@wido do you use some custom configuration, or do you simply keep the one provided by default? My next move will be to remove all the code related to ehcache in CS. Is that ok for you?

@wido
Copy link
Contributor

wido commented Oct 30, 2018

@marcaurele We just disable it in the XML, we do not use any custom ones.

@marcaurele marcaurele mentioned this pull request Nov 1, 2018
5 tasks
@rafaelweingartner rafaelweingartner modified the milestones: 4.12.0.0, 5.0.0.0 Jan 9, 2019
@yadvr
Copy link
Member

yadvr commented May 27, 2019

@blueorangutan package

@blueorangutan
Copy link

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan
Copy link

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2794

@yadvr
Copy link
Member

yadvr commented May 27, 2019

@blueorangutan test

@blueorangutan
Copy link

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan
Copy link

Trillian test result (tid-3593)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 36051 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr2913-t3593-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_internal_lb.py
Intermittent failure detected: /marvin/tests/smoke/test_privategw_acl.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_redundant.py
Smoke tests completed. 70 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

@DaanHoogland
Copy link
Contributor

code and tests are good. are we going on with this, @wido @rafaelweingartner @rhtyd @ustcweizhou ? /cc @marcaurele

@ustcweizhou
Copy link
Contributor

LGMT

@yadvr
Copy link
Member

yadvr commented Jun 4, 2019

I'm in favour of merging this as the ehcache does not work with currently CloudStack at all, all queries end up on the mysql server.
What do others think - @wido @rafaelweingartner @DaanHoogland ?

@wido wido self-requested a review June 4, 2019 19:01
Copy link
Contributor

@wido wido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yadvr
Copy link
Member

yadvr commented Jun 5, 2019

All in agreement, let's merge this!

@yadvr yadvr merged commit c5f0844 into apache:master Jun 5, 2019
Tonitzpp pushed a commit to scclouds/cloudstack that referenced this pull request Dec 18, 2025
Atualizar _labels_ automaticamente ao atualizar entre versões

Closes apache#2784 and apache#2913

See merge request scclouds/scclouds!1289
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants