Deactivate ehcache by marcaurele · Pull Request #2913 · apache/cloudstack

marcaurele · 2018-10-22T08:16:52Z

Description

This PR deactivates Ehcache in CloudStack because it is not usable in its current form. The first commit removes the default RMI cache peering configuration, which relies on multicast and most of the time cannot work. It also requires a network interface to be up, which is not always the case when developing offline.
The second commit removes the configuration that activates caching on some DAOs.

Problems

The code in CS does not seem to fit any caching mechanism, mainly because of the homemade DAO code. The three main flaws are the following:

Entities are not expected to be shared

A lot of code passes entity IDs as long values between methods, and each method then fetches the object itself. Without caching, every fetch of the same ID creates a distinct object. With the cache enabled, the same object is shared among those methods. This has been seen to cause side effects: some code still expects entity attributes to be unchanged after calling other methods, which leads to exceptions and bugs.
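The aliasing problem described here can be sketched in a few lines. This is a minimal illustration only: `VmEntity`, `VmDao`, `UncachedDao`, `CachedDao` and `markStopping` are hypothetical names invented for the example, not CloudStack classes, and a plain `HashMap` stands in for the cache.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; none of these names exist in CloudStack.
class SharedEntityDemo {
    static class VmEntity {
        final long id;
        String state;

        VmEntity(long id, String state) {
            this.id = id;
            this.state = state;
        }
    }

    interface VmDao {
        VmEntity findById(long id);
    }

    // Without a cache: each lookup builds a fresh object from the "row".
    static class UncachedDao implements VmDao {
        public VmEntity findById(long id) {
            return new VmEntity(id, "Running");
        }
    }

    // With a cache: every lookup for the same ID returns the same instance.
    static class CachedDao implements VmDao {
        private final Map<Long, VmEntity> cache = new HashMap<>();

        public VmEntity findById(long id) {
            return cache.computeIfAbsent(id, k -> new VmEntity(k, "Running"));
        }
    }

    // A helper that receives only the ID, fetches the entity itself
    // and changes it in memory.
    static void markStopping(VmDao dao, long id) {
        dao.findById(id).state = "Stopping";
    }

    // The caller fetches an entity, calls the helper, then keeps
    // relying on the attributes of its own reference.
    static String stateSeenByCaller(VmDao dao) {
        VmEntity vm = dao.findById(1L);
        markStopping(dao, 1L);
        return vm.state;
    }

    public static void main(String[] args) {
        System.out.println("uncached: " + stateSeenByCaller(new UncachedDao()));
        System.out.println("cached:   " + stateSeenByCaller(new CachedDao()));
    }
}
```

With the uncached DAO the caller still sees `Running`; with the cached DAO it sees `Stopping`, because both methods hold the same instance. Code written against the first behavior can break silently under the second.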

DAO update operations are using search queries

Some parts of the code update entities based on a search query, so the whole cache must be invalidated each time (see GenericDaoBase: public int update(UpdateBuilder ub, final SearchCriteria<?> sc, Integer rows);).
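A rough sketch of why a criteria-based update defeats per-entity eviction, assuming the cache layer only learns the affected-row count, as with an SQL UPDATE ... WHERE. `HostRow`, `HostDao`, `updateById` and `updateWhere` are hypothetical names for this example and do not reproduce GenericDaoBase; two `HashMap`s stand in for the database table and the cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-ins for illustration; not CloudStack's GenericDaoBase.
class SearchUpdateDemo {
    static class HostRow {
        final long id;
        final String cluster;
        String status;

        HostRow(long id, String cluster, String status) {
            this.id = id;
            this.cluster = cluster;
            this.status = status;
        }

        HostRow copy() {
            return new HostRow(id, cluster, status);
        }
    }

    static class HostDao {
        final Map<Long, HostRow> table = new HashMap<>(); // stands in for the database
        final Map<Long, HostRow> cache = new HashMap<>(); // stands in for the entity cache

        HostRow findById(long id) {
            HostRow cached = cache.get(id);
            if (cached == null && table.containsKey(id)) {
                cached = table.get(id).copy();
                cache.put(id, cached);
            }
            return cached;
        }

        // Update by primary key: exactly one cache entry can be stale.
        int updateById(long id, String status) {
            HostRow row = table.get(id);
            if (row == null) {
                return 0;
            }
            row.status = status;
            cache.remove(id);
            return 1;
        }

        // Update by criteria: assumed to report only a row count, so the
        // cache layer cannot tell which IDs changed and clears everything.
        int updateWhere(Predicate<HostRow> criteria, String status) {
            int rows = 0;
            for (HostRow row : table.values()) {
                if (criteria.test(row)) {
                    row.status = status;
                    rows++;
                }
            }
            if (rows > 0) {
                cache.clear();
            }
            return rows;
        }
    }

    // Three hosts in two clusters, all loaded into the cache.
    static HostDao seed() {
        HostDao dao = new HostDao();
        dao.table.put(1L, new HostRow(1L, "c1", "Up"));
        dao.table.put(2L, new HostRow(2L, "c1", "Up"));
        dao.table.put(3L, new HostRow(3L, "c2", "Up"));
        for (long id = 1; id <= 3; id++) {
            dao.findById(id);
        }
        return dao;
    }

    public static void main(String[] args) {
        HostDao dao = seed();
        dao.updateById(3L, "Maintenance");
        System.out.println("after updateById: " + dao.cache.size() + " cached");
        int rows = dao.updateWhere(h -> h.cluster.equals("c1"), "Down");
        System.out.println("after updateWhere (" + rows + " rows): " + dao.cache.size() + " cached");
    }
}
```

In this sketch the keyed update evicts a single entry, while the criteria update drops every cached host, including ones the query never touched. If such updates are frequent, the cache rarely stays warm.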

Entities based on views joining multiple tables

Quite a lot of entities are based on SQL views that join multiple entities into a single object. Enabling caching on those would require a mechanism to link and cross-remove related objects whenever one of the sub-entities changes.
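One possible shape for such a cross-removal mechanism is a reverse index from each underlying row to the cached view objects built from it. The sketch below is only an illustration of that idea: `ViewCache`, `onSourceChanged` and the `table:id` key format are assumptions made for the example, and nothing like this exists in CloudStack today.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; key names such as "vm_view:1" are illustrative only.
class ViewCacheDemo {
    static class ViewCache {
        // view key -> cached joined row (a String stands in for the view entity)
        private final Map<String, String> views = new HashMap<>();
        // underlying row key ("table:id") -> keys of the views built from it
        private final Map<String, Set<String>> dependents = new HashMap<>();

        void put(String viewKey, String joinedRow, String... sourceKeys) {
            views.put(viewKey, joinedRow);
            for (String source : sourceKeys) {
                dependents.computeIfAbsent(source, k -> new HashSet<>()).add(viewKey);
            }
        }

        String get(String viewKey) {
            return views.get(viewKey);
        }

        // Called whenever a row of an underlying table changes.
        // Returns the number of cached view objects that were evicted.
        int onSourceChanged(String sourceKey) {
            Set<String> affected = dependents.remove(sourceKey);
            if (affected == null) {
                return 0;
            }
            int evicted = 0;
            for (String viewKey : affected) {
                if (views.remove(viewKey) != null) {
                    evicted++;
                }
            }
            return evicted;
        }
    }

    public static void main(String[] args) {
        ViewCache cache = new ViewCache();
        cache.put("vm_view:1", "vm-1 on host-7 owned by alice",
                "vm_instance:1", "host:7", "account:3");
        cache.put("vm_view:2", "vm-2 on host-7 owned by bob",
                "vm_instance:2", "host:7", "account:4");

        // Changing one account evicts only the view row built from it.
        System.out.println("evicted: " + cache.onSourceChanged("account:3"));
        // Changing the shared host evicts every remaining view row that joined it.
        System.out.println("evicted: " + cache.onSourceChanged("host:7"));
    }
}
```

Every write path to every joined table would have to call something like `onSourceChanged`, which is the bookkeeping the custom DAO layer currently lacks. The sketch also leaves stale entries in the reverse index after an eviction, which a real implementation would need to clean up.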

Final word

Based on the points above, the best approach IMHO would be to move away from the custom DAO framework in CS and use a well-known one (out of scope for this change, of course). It would handle caching well, as well as the joins currently done by the views. It's not an easy change, but it would fix a lot of issues along the way and bring a proven, robust framework to an important part of the code.

Types of changes

Breaking change (fix or feature that would cause existing functionality to change)
New feature (non-breaking change which adds functionality)
Bug fix (non-breaking change which fixes an issue)
Enhancement (improves an existing feature and functionality)
Cleanup (Code refactoring and cleanup, that may add test cases)

How Has This Been Tested?

A cluster of management servers with a manual ehcache RMI configuration was set up to test different changes in the caching. The RMI cache setup was verified through cache invalidation being propagated to the other server. A series of integration tests was run while figuring out the reason for (random) errors.

rafaelweingartner · 2018-10-22T13:46:25Z

@marcaurele as long as you are removing the use of "ehCache", what about removing its dependency in our POM.xml files too?

yadvr · 2018-10-22T14:34:40Z

@marcaurele I only see the RMI config removed, can you instead provide the ehcache configuration to be used that can work out of the box by default? The GenericDaoBase initialises and still uses ehcache's Cache, CacheManager, Element etc.

marcaurele · 2018-10-23T07:00:23Z

@rafaelweingartner @rhtyd Before doing too much work that people might not agree with, I simply wanted to push this PR to open the discussion on the ehcache topic. If this one is going to be merged, I will post another one that cleans up all the code that uses ehcache.
Are those findings not a surprise to you? cc @wido

yadvr · 2018-10-23T07:55:53Z

@marcaurele the idea of ehcache was to reduce some load on the DB server. Can you start a discussion thread on dev@ with details such as what you'll replace ehcache with and how? Please also include your long-term plan re: DB refactoring, the goals, proposed timeline, etc.

marcaurele · 2018-10-23T08:08:21Z

@rhtyd sure

wido · 2018-10-23T10:18:46Z

I have seen some issues with ehcache on mgmt servers as well; we often disable it on deployments.

marcaurele · 2018-10-30T15:50:03Z

@wido do you use some custom configuration, or do you simply keep the one provided by default? My next move will be to remove all the code related to ehcache in CS. Is that ok for you?

wido · 2018-10-30T16:01:17Z

@marcaurele We just disable it in the XML, we do not use any custom ones.

yadvr · 2019-05-27T12:41:59Z

@blueorangutan package

blueorangutan · 2019-05-27T12:42:15Z

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

blueorangutan · 2019-05-27T14:47:20Z

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2794

yadvr · 2019-05-27T14:58:54Z

@blueorangutan test

blueorangutan · 2019-05-27T14:59:18Z

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

blueorangutan · 2019-05-28T01:35:05Z

Trillian test result (tid-3593)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 36051 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr2913-t3593-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_internal_lb.py
Intermittent failure detected: /marvin/tests/smoke/test_privategw_acl.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_redundant.py
Smoke tests completed. 70 look OK, 0 have error(s)
Only failed tests results shown below:

Test	Result	Time (s)	Test File

DaanHoogland · 2019-05-28T08:35:40Z

Code and tests are good. Are we going ahead with this, @wido @rafaelweingartner @rhtyd @ustcweizhou? /cc @marcaurele

ustcweizhou · 2019-05-28T09:44:59Z

LGTM

yadvr · 2019-06-04T15:57:56Z

I'm in favour of merging this, as ehcache does not currently work with CloudStack at all; all queries end up on the MySQL server.
What do others think - @wido @rafaelweingartner @DaanHoogland ?

wido

LGTM

yadvr · 2019-06-05T10:57:55Z

All in agreement, let's merge this!

Automatically update _labels_ when upgrading between versions. Closes apache#2784 and apache#2913. See merge request scclouds/scclouds!1289

marcaurele added 2 commits October 22, 2018 09:45

ehcache: remove default active configuration for RMI cache peering

7b3d326

ehcache: deactivate caching for DAOs

19c268c

rafaelweingartner assigned marcaurele Oct 22, 2018

rafaelweingartner added the type:enhancement label Oct 22, 2018

rafaelweingartner added this to the 4.12.0.0 milestone Oct 22, 2018

marcaurele mentioned this pull request Nov 1, 2018

Remove api rate limiting plugin #2993

Closed

5 tasks

rafaelweingartner modified the milestones: 4.12.0.0, 5.0.0.0 Jan 9, 2019

yadvr approved these changes May 27, 2019

wido self-requested a review June 4, 2019 19:01

wido approved these changes Jun 4, 2019

rafaelweingartner approved these changes Jun 5, 2019

DaanHoogland approved these changes Jun 5, 2019

yadvr merged commit c5f0844 into apache:master Jun 5, 2019
